
A Complete SHAP Tutorial: How to Explain Any Black-box ML Model in Python
by BEXBoost | Oct 2022



Explain any black-box model to non-technical people

Photo by Alexander Grey

Today, you can’t just walk up to your boss and say, “Here is my best model. Let’s put it into production and be happy!” No, it doesn’t work that way anymore. Companies are picky about adopting AI solutions because of their “black box” nature. They demand model explainability.

If even ML specialists are building tools to understand and explain the models they create, the concerns and suspicions of non-technical folks are entirely justified. One of those tools, introduced a few years ago, is SHAP. It can break down the mechanics of any machine learning model or neural network and make them understandable to anyone.

Today, we will learn how SHAP works and how you can use it for classical ML tasks in your practice.

SHAP (SHapley Additive exPlanations) is a Python package based on the 2017 NIPS paper that introduced SHAP values. The premise of this paper, and of Shapley values in general, comes from cooperative game theory.

A question often posed in cooperative games is: in a group of n players with different skill sets, how do we divide a prize so that everyone gets a fair share based on their contribution? Depending on the number of players, when they joined the game, and how much each contributed to the outcome, this calculation can become horribly complex.

But what does game theory have to do with machine learning? Well, we could reframe the above question so that it becomes “Given a prediction, how do we most accurately measure each feature’s contribution?” Yes, it is like asking for feature importances of a model, but the answer the Shapley values give is much more sophisticated.

Specifically, Shapley values can help you in:

  1. Global model interpretability — imagine you work for a bank and build a classification model for loan applications. Your manager wants you to explain what (and how) different factors influence the decisions of your model. Using SHAP values, you can give a concrete answer with details of which features lead to more loans and which features lead to more rejections. You make your manager happy because now, he can draw up basic guidelines for future bank customers to increase their chances of getting a loan. More loans mean more money means a happier manager means a higher salary for you.
Image by author
  2. Local interpretability — your model rejects one of the applications submitted to the bank a few days ago. The customer claims he followed all the guidelines and was sure to get a loan from your bank. Now, you are legally obligated to explain why your model rejected that particular candidate. Using Shapley values, every case can be analyzed independently, without worrying about its connections to other samples in the data. In other words, you have local interpretability. You extract the Shapley values for the complaining customer and show them what parts of their application caused the rejection. You prove them wrong with a plot like this:
Image by author

So, how do you calculate the mighty Shapley values? That’s where we start using the SHAP package.

The exact mathematical details of calculating Shapley values deserve an article of their own. So, for now, I will stand on the shoulders of giants and refer you to their posts. They are guaranteed to solidify your understanding of the concepts (1, 2 — by Khuyen Tran).

In practice, however, you will rarely refer to the math behind Shapley values. The reason is that all the magical details are nicely packaged inside SHAP. So, let’s look at our very first example.

Using the Diamonds dataset built into Seaborn, we will be predicting diamond prices using several physical measurements. I processed the dataset beforehand and divided it into train and validation sets. Here is the training set:

Image by author
>>> X_train.shape, X_valid.shape
((45849, 9), (8091, 9))

Cut, color, and clarity are categorical features. They are ordinally encoded because their order carries meaning for the context and, ultimately, for the model's decision.

As a baseline, we fit an XGBRegressor model and evaluate the performance with Root Mean Squared Error:
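The original code block isn't shown here, so below is a minimal sketch of how such a baseline could look. The variable names X_train, y_train, X_valid, and y_valid follow the split described above; the hyperparameters are assumptions and left mostly at their defaults.

import xgboost as xgb
from sklearn.metrics import mean_squared_error

# Fit a plain XGBoost regressor as the baseline
model = xgb.XGBRegressor(random_state=42)
model.fit(X_train, y_train)

# Evaluate with Root Mean Squared Error on the validation set
valid_preds = model.predict(X_valid)
rmse = mean_squared_error(y_valid, valid_preds) ** 0.5
print(f"Validation RMSE: {rmse:.2f}")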

Now, let’s finally take a peek behind the curtains and calculate the Shapley values for the training set.

We start by creating an explainer object for our model:
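The original snippet isn't shown, but creating the explainer is a one-liner (assuming the fitted XGBRegressor from the baseline above is called model):

import shap

# Wrap the fitted XGBoost model in a tree explainer
explainer = shap.TreeExplainer(model)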

TreeExplainer is a special class of SHAP, optimized to work with any tree-based model in Sklearn, XGBoost, LightGBM, CatBoost, and so on. You can use KernelExplainer for any other type of model, though it is slower than tree explainers.

This tree explainer has many methods, one of which is shap_values:
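Calling it on the training set might look like this (a sketch; the shap_values name is my own):

# Compute SHAP values for every row of the training set (slow on the CPU)
shap_values = explainer.shap_values(X_train)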

As I have said, calculating Shapley values is a complex process, which is why it took ~22 mins for just 45k observations on the CPU. For large modern datasets with hundreds of features and millions of samples, the calculation can take days. So, we turn to GPUs to calculate the SHAP values.

As of now, GPU support is not stable in SHAP, but we have a workaround. The predict method of the core XGBoost model has a pred_contribs argument which, when set to True, calculates SHAP values on the GPU:
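A rough sketch of the workaround is below. Routing prediction to the GPU via the gpu_predictor setting is an assumption that depends on your XGBoost version and on having a CUDA-enabled build; without it, the same call still runs on the CPU.

# Extract the core Booster from the sklearn wrapper
booster_xgb = model.get_booster()

# Ask XGBoost to run prediction on the GPU (requires a CUDA-enabled build)
booster_xgb.set_param({"predictor": "gpu_predictor"})

# pred_contribs=True returns per-feature SHAP contributions plus a bias column
shap_values_xgb = booster_xgb.predict(xgb.DMatrix(X_train), pred_contribs=True)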

Note that LightGBM also supports SHAP values through its predict method. In CatBoost, you get them by calling the get_feature_importance method on the model with type set to ShapValues.

After extracting the core booster model of XGBoost, it only took about a second to calculate Shapley values for 45k samples:

>>> shap_values_xgb.shape
(45849, 10)

But wait — the SHAP values from the tree explainer had nine columns; this one has ten! Don't worry; we can safely ignore the last column for now, as it just contains the bias term that XGBoost adds by default:
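As a quick sanity check (the slicing is mine, not necessarily the article's exact code), you can confirm the last column holds a single constant and drop it so the shape matches the tree explainer's output:

import numpy as np

# Every row carries the same bias term in the last column
print(np.unique(shap_values_xgb[:, -1]))

# Keep only the nine per-feature columns for plotting
shap_values_xgb = shap_values_xgb[:, :-1]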

Image by author

We got the Shapley values; now what? Now, we get plottin’.

Let’s see which physical measurements of diamonds are the most important when determining price:
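One way to produce such a plot is SHAP's summary plot in bar mode (a sketch; the article's exact call may differ):

# Bar chart of the mean absolute SHAP value per feature
shap.summary_plot(shap_values_xgb, X_train, plot_type="bar")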

Image by author

The carat stands out as the driving factor for a diamond’s price. Reading the axis title below, we see that the importances are just the average absolute Shapley values for a feature. We can check that below:
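A minimal check, following the definition in the axis title (the computation itself is mine):

import numpy as np
import pandas as pd

# Mean |SHAP value| per feature - should match the ordering in the bar plot
mean_abs_shap = pd.Series(
    np.abs(shap_values_xgb).mean(axis=0), index=X_train.columns
).sort_values(ascending=False)
print(mean_abs_shap)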

But that’s not much different than the feature importances plot you would get from XGBoost:

>>> xgb.plot_importance(booster_xgb);
Image by author

That's where we would be wrong. You can't trust feature importances from XGBoost because they are inconsistent across different calculation methods. Watch how the importances change with the importance type:
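For example, XGBoost supports several importance types, and each can give a different ranking (a sketch; "weight", "gain", and "cover" are the built-in options):

import matplotlib.pyplot as plt

# The same booster, three different importance definitions
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, imp_type in zip(axes, ["weight", "gain", "cover"]):
    xgb.plot_importance(booster_xgb, importance_type=imp_type, ax=ax, title=imp_type)
plt.tight_layout()
plt.show()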

Image by author

In contrast, feature importances obtained from Shapley values are consistent and trustworthy.

But we won't stop here. In the above plots, we only looked at absolute importances; we don't know which features influence the model positively or negatively. Let's find out with SHAP's summary_plot:
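A sketch of the call:

# Beeswarm plot: each dot is one sample's SHAP value for one feature,
# colored by that feature's actual value
shap.summary_plot(shap_values_xgb, X_train)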

Image by author

Here is how to interpret the above plot:

  1. The left vertical axis denotes feature names, ordered based on importance from top to bottom.
  2. The horizontal axis represents the magnitude of the SHAP values for predictions.
  3. The color bar on the right represents the actual magnitude of a feature as it appears in the dataset and is used to color the points.

We see that as carat increases, its effect on the model output becomes more positive. The same is true for the y feature. The x and z features are a bit trickier, with a cluster of mixed points around the center.

We can get a deeper insight into each feature’s effect on the entire dataset with dependence plots. Let’s see an example and explain it later:
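A sketch of the first dependence plot for carat, with interaction coloring switched off:

# SHAP value of carat plotted against the carat value itself
shap.dependence_plot("carat", shap_values_xgb, X_train, interaction_index=None)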

Image by author

This plot aligns with what we saw in the summary plot before. As carat increases, its SHAP value increases. By changing the interaction_index parameter to auto, we can color the points with the feature that most strongly interacts with carat:
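That version might look like this (a sketch):

# "auto" picks the feature that interacts most strongly with carat
# and uses it to color the points
shap.dependence_plot("carat", shap_values_xgb, X_train, interaction_index="auto")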

Image by author

It seems that carat interacts with the clarity of the diamonds much more strongly than with other features.

Let’s now create a dependence plot for categorical features:
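For instance, for the ordinally encoded color feature (an assumption; the article may use a different categorical column):

# Dependence plot for an ordinally encoded categorical feature
shap.dependence_plot("color", shap_values_xgb, X_train)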

Image by author

This plot also goes hand in hand with the summary plot. The last color categories affect the price negatively while interacting with carat.

I will let you explore the dependence plots for other features below:
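As a sketch, you could simply loop over the remaining columns:

# Adjust the list to whichever features you want to inspect
for feature in ["cut", "clarity", "depth", "table", "x", "y", "z"]:
    shap.dependence_plot(feature, shap_values_xgb, X_train)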

One of the most fantastic attributes of SHAP and Shapley values is their ability to accurately capture relationships between features. We already got a taste of that in the last section, when SHAP found the feature that interacts most strongly with carat in the dependence plots.

We can go a step further and find all feature interactions ordered by their interaction strength. For that, we need a different set of values — SHAP interaction values.

They can be calculated using the shap_interaction_values method of the tree explainer object, like so:
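Here is a sketch of that call (the variable name is mine):

# SHAP interaction values from the tree explainer - very slow on the CPU
shap_interaction_values = explainer.shap_interaction_values(X_train)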

But this is even more time-consuming than computing regular SHAP values. So, we will turn to GPUs once more with XGBoost:
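A sketch of that call, under the same assumptions as before about the gpu_predictor setting:

# pred_interactions=True returns an array of shape
# (n_samples, n_features + 1, n_features + 1); the extra slot is the bias
shap_interactions = booster_xgb.predict(xgb.DMatrix(X_train), pred_interactions=True)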

By setting pred_interactions to True, we get SHAP interaction values in only 15 seconds. The result is a 3D array, where the extra last row and column along the feature axes hold the bias terms:

Now that we have the interactions, what do we do with them? Frankly, even the SHAP documentation doesn't outline a reasonable use case for them, but we can get help from others. Specifically, we will use a function I learned from 4x Kaggle Grandmaster Bojan Tunguz to find the strongest feature interactions in our dataset and plot them:
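I don't have the original function at hand, so the sketch below only captures the idea: average the absolute interaction values over all samples and rank the feature pairs by that strength.

import numpy as np
import pandas as pd

def top_interactions(interaction_values, feature_names, n=10):
    # Drop the bias row/column, then average absolute interactions over samples
    mean_abs = np.abs(interaction_values[:, :-1, :-1]).mean(axis=0)
    pairs = []
    for i in range(len(feature_names)):
        for j in range(i + 1, len(feature_names)):
            pairs.append((feature_names[i], feature_names[j], mean_abs[i, j]))
    pairs = pd.DataFrame(pairs, columns=["feature_1", "feature_2", "strength"])
    return pairs.sort_values("strength", ascending=False).head(n)

top_10_inter_feats = top_interactions(shap_interactions, X_train.columns, n=10)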

Now, top_10_inter_feats contains 10 of the strongest interactions between all possible pairs of features:

We can create another function that plots these pairs based on their interaction strengths:
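A sketch of such a plotting helper:

import matplotlib.pyplot as plt

def plot_top_interactions(top_pairs):
    # One horizontal bar per feature pair, strongest interaction at the top
    labels = top_pairs["feature_1"] + " x " + top_pairs["feature_2"]
    plt.barh(labels[::-1], top_pairs["strength"][::-1])
    plt.xlabel("Mean |SHAP interaction value|")
    plt.tight_layout()
    plt.show()

plot_top_interactions(top_10_inter_feats)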

Image by author

As we can see, the interaction between y and carat is much stronger than the others. Even though this plot may not mean much to us, a domain expert might be able to decipher this relationship and use it to diagnose the model better.

For example, if your model tried to classify molecules’ reactions to different chemical stimuli, a plot like this can be helpful because it might show which chemical properties of the molecules are interacting with the stimulus. This will tell a lot to the domain expert running the experiments since they know what types of chemicals interact and whether the model could capture their behavior.

Finally, we get to the local interpretability section. It is all about explaining why the model arrived at a particular decision for a single sample.

Let’s choose a random diamond and its predicted price to explain:
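A sketch of how the selection might be done (in the article, the randomly drawn index happens to be 6559):

import numpy as np

# Pick a random training sample and look at its predicted price
idx = np.random.randint(len(X_train))
print(idx, model.predict(X_train.iloc[[idx]]))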

OK, it looks like we will be looking at the 6559th diamond in the training data. Let’s start:

We first recalculate the SHAP values using the explainer object. This is different from the shap_values function because, this time, the Shapley values are returned with a few extra properties we need for local interpretability:
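In recent versions of SHAP, calling the explainer object directly returns such an Explanation object (a sketch; the variable name matches the output below):

# Calling the explainer itself (rather than .shap_values) returns an
# Explanation object carrying data, base values, and feature names
shap_explainer_values = explainer(X_train)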

>>> type(shap_explainer_values)
shap._explanation.Explanation

Now, let’s explain the random diamond we picked out with a waterfall plot:

>>> shap.waterfall_plot(shap_explainer_values[6559])
Image by author

The E[f(x)] = 3287.856 is the mean prediction of diamond prices for the train set, i.e., preds.mean(). The f(x) = 3214.05 is the predicted price for this diamond.

The thin line in the middle denotes the mean prediction. The vertical axis shows the feature values of the 6559th diamond. The bars represent how each feature property shifted the price from the mean prediction. The red bars represent positive shifts; the blue bars represent negative shifts.

Let’s look at another diamond for completeness:

>>> shap.waterfall_plot(shap_explainer_values[4652])
Image by author

This diamond is much cheaper than the previous one, mainly because its carat is much lower, as can be seen above.

There is another plot for explaining local interpretability. SHAP calls it a force plot, and it looks like this:

Image by author

This is just an ordered, organized version of waterfall plots. All negative and positive bars are grouped to either side of the predicted price. Again, the base value shows the mean price, and the bars show how much each feature property shifts that value.

>>> shap.force_plot(shap_explainer_values[6559])
Image by author

Now, you can come up to your boss and say, “Here is my best model, and here is the explanation of why it is the best and how it works!” Hopefully, the response you get will be much more positive. Thank you for reading!


