Machine Learning Model Interpretability with Python

A Comprehensive Hands-On Guide

Oussama Errabia
Analytics Vidhya
9 min read · Aug 15, 2019


Introduction:

The potential of machine learning to improve any business is no longer a secret. However, the models we build most of the time do not explain why they made a certain prediction for a given sample, which, for some businesses, is a barrier to adopting an AI-driven strategy.

In my previous article, we discussed 15 things you should know before building your model in a production environment. One of those points highlighted the importance of machine learning model interpretation, and how crucial it can be in some businesses.

Given a machine learning model's prediction, the “why” question is becoming more and more necessary to answer, and for this sake, tools and packages are being developed to turn a machine learning model from a black box into a white box.

In this article, we will teach the use of a tool called SHAP to explain the predictions of a black-box model. The data used in this tutorial is a real-world dataset known as the Adult dataset, available in the UCI repository. The task is to predict whether a person's income is more than $50K/yr or less, so let us get started.

Loading and Inspecting the data:

First, we will load the necessary libraries and the data we will use in this tutorial:
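Here is a minimal setup sketch. It assumes the Adult dataset is loaded through shap.datasets.adult (the column names used later in this article match that loader); loading the raw CSV from the UCI repository works just as well:

import numpy as np
import pandas as pd
import lightgbm as lgb
import shap
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# display=True keeps the human-readable categorical values
# so we can encode them ourselves in the next section
X_raw, y_raw = shap.datasets.adult(display=True)
data = X_raw.copy()
data["target"] = y_raw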

After loading the data, let us have a first look at it:
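For example, assuming the DataFrame is named data as in the sketch above:

data.head()  # preview the first few rows of the Adult features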

the Adult dataset features

Now, what about our target feature:
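A quick check, again assuming the variable names from the setup sketch:

data["target"].value_counts()  # True = income > $50K, False = income <= $50K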

So our target is binary, with either True or False values: True means the person's income is more than USD 50K and False means it is less than USD 50K.

And how about our predictors (features)? Let us inspect their types:
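Something like the following, which shows which columns are numerical and which are object/category:

data.drop(columns=["target"]).dtypes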

As you can see, we are dealing with two types of features: numerical and categorical. The numerical features are all set. As for the categorical features, we have to pre-process (encode) them first.

For this project, we will keep things simple and just label-encode them; however, there are other ways of dealing with categorical features, like embeddings or target encoding. For a clearer picture on that, you can check my tutorial: tensorflow 2 tutorial on categorical features-embedding.

Encoding categorical features

So let us start encoding:
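A sketch of that encoding loop, assuming the variable names from the setup above:

# Label-encode every categorical column and keep the fitted encoders
# in a dictionary so we can decode values back later
encoders = {}
categorical_cols = data.drop(columns=["target"]).select_dtypes(
    include=["object", "category"]
).columns
for col in categorical_cols:
    le = LabelEncoder()
    data[col] = le.fit_transform(data[col].astype(str))
    encoders[col] = le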

As you can see, we have created a LabelEncoder for each categorical feature, fit it on that feature, and then transformed the feature to integers (encoded it). We also saved each encoder in a dictionary (it is good practice to keep them around).

Now that our data is ready to go, let us move on to training our LightGBM model.

First we have to create a validation set, so we can assess the performance of our trained model (we don't want to spend time interpreting a model that has no value and poor scores).
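A sketch of the split; the test size and random seed below are illustrative choices, not necessarily the ones behind the scores reported later:

X_features = data.drop(columns=["target"])
y_target = data["target"].astype(int)  # True/False -> 1/0
X_train, X_val, y_train, y_val = train_test_split(
    X_features, y_target, test_size=0.2, random_state=42, stratify=y_target
)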

We used the train_test_split function from sklearn.model_selection to split our data into a train set and a validation set. Now we can move on to fitting our model.

We start by defining our LightGBM model's parameters:
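For instance (illustrative values; the exact parameters used for the scores reported below are not shown in the article):

params = {
    "objective": "binary",
    "n_estimators": 200,
    "learning_rate": 0.05,
    "num_leaves": 31,
    "random_state": 42,
}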

Now fit:
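A sketch of the fit, using LightGBM's scikit-learn API and the parameters above:

lgbm_model = lgb.LGBMClassifier(**params)
lgbm_model.fit(X_train, y_train)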

Evaluation:

Let us observe our confusion matrix and see how our model is doing:
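A sketch of that step (rows are actual classes, columns are predicted classes):

y_pred = lgbm_model.predict(X_val)
cm = confusion_matrix(y_val, y_pred)
print(cm)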

Let us compute our validation scores:

The recall and precision for the positive class (more than $50K):

Advice: it is always best practice to compute the scores manually rather than always relying on a function; the more you do them from scratch, the more you will understand them, and understanding your metrics is crucial.

1 — Recall = TP / (TP + FN) = 1629/(1629+680) = 0.705 (70% recall)

Interpretation: the model is able to find 70% of the people that belong to the positive class (>$50K).

2 — Precision = TP/ (TP+FP) = 1629/(1629+566) = 0.74 (74% precision).

Interpretation: the model's positive predictions are 74% precise, which means every predicted positive has a 74% probability of actually being positive.
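As a quick sanity check, the same numbers can be reproduced with scikit-learn (a sketch reusing y_pred from the confusion-matrix step):

from sklearn.metrics import precision_score, recall_score

print("recall:", recall_score(y_val, y_pred))
print("precision:", precision_score(y_val, y_pred))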

The scores are relatively good for an imbalanced dataset. In this case, both classes are more or less equally important. In other cases, where the positive class is much more important, there is some tweaking to be done to favor recall over precision, but that is another tutorial for another article, which will be published in the near future.

Model Diagnostics

Understanding SHAP:

Before we start interpreting our model, let us first explain the tool we will use in the study: SHAP.

So how does SHAP work? Let me explain (of course, without complicating things):

First, there is a term we need to explain: the base value. So what is it?

The base value is used by SHAP internally, and it is simply the mean of the predicted probabilities over the train set:

base_value = np.mean(lgbm_model.predict_proba(X_train)[:, 1])

Now that we know what the base value is, let us dig into SHAP:

Let us say the model predicted a probability of 0.6 for a person to have more than USD 50K. What SHAP does is decompose the difference (0.6 - base value) across all the features, so that for a single prediction: predicted probability = base value + the sum of the per-feature SHAP values. This way, we can visualize which features contributed the most to that probability. Also keep in mind that, in that decomposition, some features can have positive values and others negative values:

1 — A positive SHAP value means that feature increases the predicted probability, pushing the instance toward the positive class.

2 — A negative SHAP value means that feature decreases the predicted probability, pushing the instance toward the negative class.

Now that we understand how SHAP works, let us use it to start understanding what drives our model to give a specific prediction for a given data point.

Global Model Interpretation with SHAP:

First, we will look at the global interpretation of the model by visualizing the features' impact on the predicted probabilities given by SHAP for a given number of samples:
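A sketch of that step. The background data, the 10-sample slice and the defensive handling of per-class output are assumptions about the setup, not the author's exact code:

# TreeExplainer can express SHAP values on the probability scale when
# given a background dataset and interventional feature perturbation
explainer = shap.TreeExplainer(
    lgbm_model,
    data=X_train,  # a subsample (e.g. X_train.sample(100)) also works and is faster
    model_output="probability",
    feature_perturbation="interventional",
)

samples = X_val.iloc[:10]  # the 10 instances visualized below
shap_values = explainer.shap_values(samples)
# some SHAP versions return one array per class for binary classifiers;
# keep the positive-class array if so
if isinstance(shap_values, list):
    shap_values = shap_values[1]

shap.summary_plot(shap_values, samples)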

Remark: make sure to set model_output to ‘probability’, so that what you will be observing are actually probabilities.

So how do you read that graph? Let me explain:

First, keep in mind:

  • the x-axis represents the SHAP values,
  • the points represent the 10 samples (each sample has its own color).

The above graph can be described as follows:

First — The features are ranked from top to bottom; features higher up have a higher impact (they are ranked by the mean of the absolute value of their SHAP values across the 10 instances we used).

Second — The same set of features makes a different impact on different samples: sometimes they contribute by increasing the probability, other times by decreasing it.

Third — By visualizing more instances we can get a global sense of some features, their predictive power, and which class they usually push toward. We can also double-check whether what the model learned about those features makes sense.

Local Model Interpretation with SHAP:

Let us do a local explanation of the model, in other words, at the sample level:

What we will do is this —

  • We will take a random sample (a person),
  • Inspect it and its true class,
  • Inspect the class predicted by the model,
  • Visualize the SHAP explanation only for that instance,
  • Make a conclusion by comparing the true class, the prediction, and what SHAP says.

So let us start.

Let’s take for example this person:

Our model has predicted that this person makes an income of more than USD 50K per year, which, based on the true label, is a correct prediction. So let us inspect how our features contributed to this prediction:
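A sketch of a single-instance explanation with a force plot; the row index and the handling of expected_value are illustrative assumptions:

i = 0  # hypothetical index of the person in the validation set
person = X_val.iloc[[i]]
print("predicted probability of >50K:", lgbm_model.predict_proba(person)[0, 1])

sv = explainer.shap_values(person)
if isinstance(sv, list):       # per-class output in some SHAP versions
    sv = sv[1]

base = explainer.expected_value
if hasattr(base, "__len__"):   # may be a per-class array in some versions
    base = base[-1]

shap.force_plot(base, sv[0], person.iloc[0], matplotlib=True)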

So, as you can see, the base value (the model's starting point) is 0.263, so SHAP will decompose 1 - 0.263 = 0.737 (the predicted probability minus the base value) over all the features. Additionally, Capital Gain is the biggest contributor to that probability, with a value of +0.49, which makes sense given the high capital gain of 15024. The Age of 32 also makes sense, as does the Education-Num.

Conclusion: for this instance the prediction is correct, and given the feature values everything makes sense, so the model is doing a great job.

However, based on my experience, the true insight comes when analyzing misclassified instances. So let us do some real inspection and have a look at a false negative sample (a sample whose true class is positive but for which the model predicted the negative class) and see if there is something to learn about the model.

We will extract all the false negative samples using the following code:
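A sketch of that extraction, reusing y_pred from the evaluation step:

# actual class positive (>50K) but predicted negative
mask = (y_val == 1) & (y_pred == 0)
false_negatives = X_val[mask]
print("number of false negatives:", len(false_negatives))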

Now let us inspect one of those false negative instances:

Reminder: for this person the model predicted the class “< USD 50K”, but the actual class is “> USD 50K”.
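A sketch of that inspection, force-plotting one of the false negatives (picking the first one is an arbitrary choice; base and explainer come from the earlier sketches):

fn_person = false_negatives.iloc[[0]]
sv_fn = explainer.shap_values(fn_person)
if isinstance(sv_fn, list):
    sv_fn = sv_fn[1]

shap.force_plot(base, sv_fn[0], fn_person.iloc[0], matplotlib=True)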

Okay, very interesting. So let us walk through the above output:

The model has predicted a probability of 0.36; 0.263 of it is coming from the base value, and the rest is coming from the contributions of the features.

So, as the graph shows, Marital Status, Age, Relationship and Sex are contributing positively to the prediction, which means they are pushing that sample in the right direction, which is having a probability > 0.5.

Conversely, Capital Gain, Education-Num and Occupation are doing the opposite: they are contributing negative values, which lowers the probability.

Now, what can we say from this?

First, regarding Capital Gain, it makes sense for it to push the instance toward the negative class, since the value of that feature is 0, and since its impact is not big, I guess it is okay.

Now for Occupation, the biggest negative contributor, we have a value of 7, and by using our LabelEncoder we get “7 = Machine-op-inspct: Machine Operator Inspector”. Judging from our training data, we have 1387 people with that occupation: 1209 of them have an income of less than USD 50K and only 178 have an income of more than USD 50K. Therefore, it makes sense that such a basic model with no feature engineering behaves this way and contributes negatively (pushes toward the negative class) when the occupation takes that value; however, this is a great insight to keep in mind when improving the model.
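A quick way to reproduce that kind of count (a sketch using the saved encoders; the exact numbers depend on the train/validation split):

# decode the Occupation codes back to strings and cross-tabulate with the target
occupation_names = encoders["Occupation"].inverse_transform(X_train["Occupation"])
print(pd.crosstab(occupation_names, y_train.to_numpy()))  # look at the Machine-op-inspct row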

Et voilà! As you can see, this is why model diagnostics and interpretation are so important. Just to list some of the obvious conclusions:

1 — It makes you trust the model, as you are now inspecting its predictions and double-checking whether they make sense or not (an evaluation score alone is definitely not enough), the same as we did above.

2 — It gives you insight into future improvements, as it helps you diagnose where the model goes wrong, the same as what we did above.

3 — It gives you prediction reports, something that is a must in some businesses where a required action has to be taken based on a prediction (for example, if a model says a client is going to leave, you do want to let the company know what was the reason behind this client's decision, and this kind of study will help you do that).

We will stop this tutorial at this point for now; more chapters on the subject will be coming next. SHAP is without doubt a great tool and has more functionality to explore beyond what we used above, like dependence plots, which we will try to cover in a future lesson.

I hope you found this tutorial insightful and practical, and I hope you are feeling like a better data scientist after reading all these words.

Here is the full Jupyter notebook for reproducing the above results.

Till next time; meanwhile, if you have any questions for me, or a suggestion for a future tutorial, my contact info is below.

About Me

I am a Principal Data Scientist @ Clever Ecommerce Inc. We help businesses create and manage their Google Ads campaigns with powerful technology based on Artificial Intelligence.

You can reach out to me on LinkedIn or by email: errabia.oussama@gmail.com.
