SHAP for interpreting ML models, explained with code

Interpreting a baseline Neural Network

Mehul Gupta
Data Science in your pocket

--

After covering LIME, we will cover SHAP, aka SHapley Additive exPlanations, an agnostic method for interpreting machine learning models.

agnostic: methods that can interpret both white-box & black-box models

We will first walk through the maths behind SHAP, followed by a baseline example of how to use it, as we did for LIME. Some of the steps are also very similar to LIME's. Do check out LIME here

Enough talking, time for some action!!

We will again be interpreting a shallow Neural Network for binary classification with 4 features (as we did for LIME)

  • Pick up a random sample from the training/validation dataset, say X.
  • Generate an artificial dataset using the random sample picked in the previous step.

How?

Here comes the tricky part where SHAP is completely different from LIME.

In LIME, we generate the artificial dataset using summary stats over the entire training dataset. In SHAP, we follow the idea of coalitions, which comes from Game Theory.

What’s that?

Of all the features present in the training set, we randomly switch them ‘ON’ & ‘OFF’ and generate new samples, where if a feature is

ON: Take the same value as in X for that feature

OFF: Pick up some random value from the training dataset for that feature.

So, if the random sample X we picked has the following values

A: 10, B:20, C:30, D:40

And we kept

ON: A & D

OFF: B & C

Then, the new sample may look something like this

A:10, B:60, C:100, D:40

Where B & C got some random value from the training dataset. As you must have noticed, A & D have the same values as in X but not B & C.

As shown in the above example, we generate multiple samples following all permutations & combinations by switching different features ON & OFF. A minimal sketch of this generation step follows.
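To make this concrete, here is a rough sketch of the generation step. It assumes the training data is a NumPy array; make_coalition_samples is an illustrative helper written for this post, not a function from the shap library.

import numpy as np

def make_coalition_samples(x, X_train, n_samples, seed=0):
    # each row of `masks` is a random coalition: 1 = ON, 0 = OFF
    rng = np.random.default_rng(seed)
    masks = rng.integers(0, 2, size=(n_samples, len(x)))
    # for OFF features, borrow values from randomly chosen training rows
    donors = X_train[rng.integers(0, len(X_train), size=n_samples)]
    # ON features keep x's value, OFF features take the donor's value
    samples = np.where(masks == 1, x, donors)
    return samples, masks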

  • Once the artificial dataset is generated, we will be assigning weights to different samples as we did in LIME but with a twist.

This time, the weights assigned aren’t based on any relation to the random sample X, but on a new kernel, the SHAP Kernel, which assigns more weightage to samples with low entropy in terms of ON & OFF features, i.e. a highly imbalanced count of ONs vs OFFs. So, for a dataset with 5 features:

weightage(4 OFFs + 1 ON) > weightage(3 OFFs + 2 ONs)

weightage(3 ONs + 2 OFFs) = weightage(3 OFFs + 2 ONs)

weightage(5 ONs + 0 OFFs) > weightage(4 OFFs + 1 ON)
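These comparisons follow from the Kernel SHAP weighting kernel in the SHAP paper: for M features and a coalition with |z| features ON, the weight is (M − 1) / (C(M, |z|) · |z| · (M − |z|)). A small sketch to verify the numbers above:

from math import comb

def shap_kernel_weight(M, num_on):
    # fully ON/OFF coalitions get infinite weight; in practice they
    # are handled separately rather than weighted like the rest
    if num_on == 0 or num_on == M:
        return float('inf')
    return (M - 1) / (comb(M, num_on) * num_on * (M - num_on))

print(shap_kernel_weight(5, 1))  # 4 OFFs + 1 ON  -> 0.2
print(shap_kernel_weight(5, 2))  # 3 OFFs + 2 ONs -> ~0.067
print(shap_kernel_weight(5, 3))  # 2 OFFs + 3 ONs -> ~0.067 (symmetric)
print(shap_kernel_weight(5, 5))  # 5 ONs + 0 OFFs -> inf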

  • Similar to LIME, train a white-box model on the weighted artificial samples created in the previous step (a rough sketch follows this list)
  • Interpret this white-box model
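Putting the pieces together, here is a rough sketch of the surrogate step, reusing the illustrative helpers above. This is a simplification of what shap.KernelExplainer does internally, not its actual code; x, X_train and model are assumed to be the picked sample, the training matrix and the black-box model.

import numpy as np
from sklearn.linear_model import LinearRegression

samples, masks = make_coalition_samples(x, X_train, n_samples=1000)
# black-box predictions for the perturbed samples (probability of class 1)
preds = model.predict(samples)[:, 0]
# SHAP kernel weight per coalition; drop the infinite-weight ones
weights = np.array([shap_kernel_weight(masks.shape[1], int(m.sum()))
                    for m in masks])
finite = np.isfinite(weights)
# weighted linear model on the ON/OFF masks: its coefficients
# approximate the SHAP values of x's features
surrogate = LinearRegression()
surrogate.fit(masks[finite], preds[finite], sample_weight=weights[finite])
print(surrogate.coef_)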

As simple as that

So, as you must have noticed, SHAP has almost the same implementation as LIME, with a couple of major changes:

  • Artificial dataset generation
  • Weighing the artificial dataset

Everything else remains the same.

Let’s quickly implement SHAP over the same dataset as we used for LIME’s demonstration

  1. Loading dataset and training a baseline neural network
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import shap

shap.initjs()

# load train & test data; 'Target' holds the binary label
df = pd.read_csv('abc.csv')
test = pd.read_csv('test.csv')
target = df.pop('Target')

def create_baseline():
    # create model: 4 input features -> 60 -> 30 -> 1 sigmoid output
    model = Sequential()
    model.add(Dense(60, input_shape=(4,), activation='relu'))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = create_baseline()
model.fit(df, target, epochs=100)

Skipping the explanation for the above code snippet, as it is just standard training of a neural network on a tabular dataset.

2. Create a function to return probabilities as an N x 2 numpy array (N = samples, 2 = probabilities for the 2 classes)

def return_prob(data):
    # probability of class 1 from the sigmoid output, shape (N, 1)
    p1 = model.predict(data)
    # probability of class 0 is its complement
    p0 = 1 - p1
    # stack into an N x 2 array of [P(class 0), P(class 1)] per sample
    return np.hstack([p0, p1])

As in LIME, we need a function that returns an N x M np.array of class probabilities, where N = total samples and M = total labels.
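A quick sanity check on the function before handing it to SHAP:

probs = return_prob(df.to_numpy()[:5])
print(probs.shape)        # (5, 2)
print(probs.sum(axis=1))  # each row sums to ~1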

3. Create SHAP object and get an interpretation for the test dataset

shap_explainer = shap.KernelExplainer(return_prob, shap.kmeans(df.to_numpy(), 10))
shap_exp = shap_explainer.shap_values(test.to_numpy())

In the above code snippet,

We create a SHAP object using KernelExplainer (this uses the SHAP Kernel for assigning weights). Two params are passed to it: return_prob() and a summarized version of the training dataset obtained using shap.kmeans. The summarization is done to reduce computation time.

shap_explainer.shap_values() calculates the SHAP values (the interpretation) for the test dataset. Since return_prob() outputs 2 columns, shap_exp is a list with one array of SHAP values per class.

4. Plot visualization for 2 samples, one for positive & other for negative class (indexes for which are known beforehand)

# force plot for the first test sample (true label = 0); base value and
# SHAP values are both taken for class 1, so the plot shows P(class 1)
plot1 = shap.force_plot(shap_explainer.expected_value[1],
                        shap_exp[1][0,:],
                        test.to_numpy()[0,:],
                        feature_names=df.columns)
# force plot for the second test sample (true label = 1)
plot2 = shap.force_plot(shap_explainer.expected_value[1],
                        shap_exp[1][1,:],
                        test.to_numpy()[1,:],
                        feature_names=df.columns)

Let’s see the visualizations. They are very different from what we saw in LIME

Sample with label = 0

In the above plot

  • Model prediction=0.44 (bold; black). Hence, the model predicts it to belong to class 0.
  • The red bar is pushing the probability towards 1 while the blue bar towards 0. As the influence of ‘A’ is higher (longer bar) for this sample, the final prediction is towards 0.
  • The two features, B & C, have no effect at all.

Sample with label=1

In the above plot

  • Model prediction=0.89 (bold; black). Hence, the model predicts it to belong to class 1.
  • The red bar is pushing the probability towards 1 while the blue bar towards 0. As the influence of ‘D’ is higher (longer bar) for this sample, the final prediction is toward 1.
  • The two features, B & C, have no effect at all.

With this, it's a wrap.
