SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) for model explainability

Afaf Athar · Analytics Vidhya · 5 min read · Oct 4, 2020

Why Is Model Interpretability So Important?

AI can be astonishing in its prediction accuracy, processing efficiency, and evaluation performance. However, computers normally don't explain their predictions, and this becomes a barrier to the adoption of machine learning models: if users don't trust a model or a prediction, they won't use or deploy it. The problem, then, is how to help users trust a model.

While simpler classes of models (such as linear models and decision trees) are often readily understood by humans, the same is not true for complex models (e.g., ensemble methods, deep neural networks). Such complex models are, for all practical purposes, black boxes. One way to understand the behavior of such classifiers is to build simpler explanation models that are interpretable approximations of these black boxes.

To this end, several techniques have been proposed in the literature. LIME and SHAP are two popular model-agnostic, local explanation approaches designed to explain any given black-box classifier. These methods explain individual predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model (e.g., a linear model) locally around each prediction. Specifically, LIME and SHAP estimate feature attributions on individual instances, which capture the contribution of each feature to the black-box prediction. Below, we provide some details of these approaches, while also highlighting how they relate to each other.

What is LIME?

LIME (Local Interpretable Model-agnostic Explanations) is a novel explanation technique that explains the prediction of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction.
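
At its core, the recipe can be sketched in a few lines: perturb the instance, get black-box predictions for the perturbations, and fit a weighted linear model on them. The sketch below is a toy illustration of that recipe, not the library's actual implementation (which adds feature selection and smarter sampling); all names in it are mine, and black_box is assumed to map an array of instances to one score per instance.

import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(black_box, x, num_samples=1000):
    """Toy LIME: explain black_box's prediction at one instance x (1-D array)."""
    rng = np.random.default_rng(0)
    kernel_width = 0.75 * np.sqrt(len(x))  # mirrors LIME's default heuristic
    # 1. Perturb: randomly switch features "off" in a binary interpretable space
    masks = rng.integers(0, 2, size=(num_samples, len(x)))
    perturbed = masks * x  # zeroed-out features stand in for "absent"
    # 2. Query the black box on the perturbed neighbours
    preds = black_box(perturbed)  # one score per neighbour, e.g. P(class)
    # 3. Weight each neighbour by its proximity to the original instance
    distances = np.sqrt((1 - masks).sum(axis=1))  # more features removed = farther away
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # 4. Fit an interpretable (weighted linear) surrogate locally
    surrogate = Ridge(alpha=1.0).fit(masks, preds, sample_weight=weights)
    return surrogate.coef_  # per-feature local attributions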

What is SHAP?

SHAP, which stands for SHapley Additive exPlanations, is arguably the state of the art in machine learning explainability. The algorithm was first published in 2017 by Lundberg and Lee, and it is a brilliant way to reverse-engineer the output of any predictive algorithm.

SHAP values are used whenever you have a complex model (it could be gradient boosting, a neural network, or anything that takes some features as input and produces some predictions as output) and you want to understand what decisions the model is making.
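
The Shapley values that SHAP builds on come from cooperative game theory: a feature's attribution is its average marginal contribution over all orderings in which features could be added. For a handful of features this can be computed exactly; here is a hypothetical two-feature illustration, where the payoff dictionary stands in for a model evaluated on subsets of known features.

from itertools import permutations

def exact_shapley(value_fn, features):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings. Only feasible for small feature sets."""
    phi = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        included = set()
        for f in order:
            before = value_fn(frozenset(included))
            included.add(f)
            phi[f] += (value_fn(frozenset(included)) - before) / len(orderings)
    return phi

# Toy "model": a prediction value for each subset of known features
payoff = {frozenset(): 0.0, frozenset({'a'}): 10.0,
          frozenset({'b'}): 4.0, frozenset({'a', 'b'}): 16.0}
print(exact_shapley(payoff.get, ['a', 'b']))  # {'a': 11.0, 'b': 5.0}

Note that the attributions sum to the full prediction minus the empty baseline (11 + 5 = 16), which is exactly the additive property SHAP's name refers to.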

Using the Hotel Reviews classification dataset, we are going to build a multi-class text classification model, then apply LIME and SHAP separately to explain it. Because we have done text classification many times before, we will build the NLP models quickly and focus on their interpretability.

LIME & SHAP help us explain how an NLP model works, not only to end-users but also to ourselves.

Data Pre-processing, Feature Engineering, and Logistic Regression:

The most important part of building any machine learning model is getting the best out of the dataset.

import pandas as pd
import numpy as np
import sklearn
import sklearn.ensemble
import sklearn.metrics
from sklearn.utils import shuffle
import re
from nltk.corpus import stopwords
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
import lime
from lime import lime_text
from lime.lime_text import LimeTextExplainer
from sklearn.pipeline import make_pipeline
df = pd.read_csv('hotel-reviews.csv')
df.head()
df = df[pd.notnull(df['Browser_Used'])]  # drop rows with a missing label
df = df.sample(frac=0.5, random_state=99).reset_index(drop=True)  # downsample to half for speed
df = shuffle(df, random_state=22)
df = df.reset_index(drop=True)
df['class_label'] = df['Browser_Used'].factorize()[0]  # encode string labels as integers
class_label_df = df[['Browser_Used', 'class_label']].drop_duplicates().sort_values('class_label')
label_to_id = dict(class_label_df.values)
id_to_label = dict(class_label_df[['class_label', 'Browser_Used']].values)

Text cleaning and preprocessing, generating a clean corpus for better model interpretation:

REPLACE_BY_SPACE_RE = re.compile('[/(){}\[\]\|@,;]')
BAD_SYMBOLS_RE = re.compile('[^0-9a-z #+_]')
STOPWORDS = set(stopwords.words('english'))

def clean_text(text):
    """
    text: a string
    return: modified initial string
    """
    # text = BeautifulSoup(text, "lxml").text  # HTML decoding: BeautifulSoup's text attribute returns the string stripped of any HTML tags and metadata
    text = text.lower()  # lowercase text
    text = REPLACE_BY_SPACE_RE.sub(' ', text)  # replace the symbols matched by REPLACE_BY_SPACE_RE with a space
    text = BAD_SYMBOLS_RE.sub('', text)  # remove the symbols matched by BAD_SYMBOLS_RE
    text = ' '.join(word for word in text.split() if word not in STOPWORDS)  # remove stopwords
    return text

df['Description'] = df['Description'].apply(clean_text)
df['class_label'].value_counts()
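
The LIME pipeline below chains a vectorizer and a logreg classifier, but the post never shows them being fit. A minimal sketch of that missing step, assuming a bag-of-words representation (the hyperparameters and variable names here are my own):

list_corpus = df['Description'].tolist()
list_labels = df['class_label'].tolist()
X_train_text, X_test_text, y_train_lr, y_test_lr = train_test_split(
    list_corpus, list_labels, test_size=0.2, random_state=40)
vectorizer = CountVectorizer(analyzer='word', ngram_range=(1, 3), binary=True)
train_vectors = vectorizer.fit_transform(X_train_text)  # learn the vocabulary on the training split only
test_vectors = vectorizer.transform(X_test_text)
logreg = LogisticRegression(max_iter=1000)
logreg.fit(train_vectors, y_train_lr)
y_pred = logreg.predict(test_vectors)
print('accuracy: %.3f' % accuracy_score(y_test_lr, y_pred))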

Interpreting text predictions with LIME

To learn more, see LIME tutorials.

c = make_pipeline(vectorizer, logreg)  # pipeline so LIME can call predict_proba on raw text
class_names = list(df.Browser_Used.unique())
explainer = LimeTextExplainer(class_names=class_names)

The objective is not to produce higher results, but better analysis.
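
The exp object read below never appears in the post either; with LIME's API it would be created along these lines (the document index and number of features are illustrative):

idx = 0  # hypothetical: pick any document from the test split
exp = explainer.explain_instance(X_test_text[idx], c.predict_proba,
                                 num_features=6, labels=[0, 1])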

print('Explanation for class %s' % class_names[1])
print('\n'.join(map(str, exp.as_list(label=1))))

In this example, the two classes involved are Internet Explorer and Firefox.

exp.show_in_notebook(text=False)

Let me try to explain this visualization:

  • For this document, the word “group” has the highest positive score for class Internet Explorer.
  • Our model predicts this document should be labeled Internet Explorer, with a probability of 0.25.
  • On the other hand, the word “easy” scores as negative for class Firefox, while our model has learned that the word “private” has a small positive score for class Firefox.

Interpreting text predictions with SHAP

The following process is adapted from this tutorial.

from sklearn.preprocessing import MultiLabelBinarizer
import tensorflow as tf
from tensorflow.keras.preprocessing import text
# Note: the original post imported keras.backend.tensorflow_backend and called
# K.set_session here; that legacy session setup is a no-op as written and is
# neither needed nor available in TensorFlow 2.x, so it is omitted.
import shap
tags_split = [tags.split(',') for tags in df['Browser_Used'].values]  # one (single-element) tag list per review
print(tags_split[:10])
tag_encoder = MultiLabelBinarizer()  # one-hot encode the browser labels
tags_encoded = tag_encoder.fit_transform(tags_split)
num_tags = len(tags_encoded[0])
print(df['Description'].values[0])
print(tag_encoder.classes_)
print(tags_encoded[0])
train_size = int(len(df) * .8)  # 80/20 train-test split
print('train size: %d' % train_size)
print('test size: %d' % (len(df) - train_size))
y_train = tags_encoded[:train_size]
y_test = tags_encoded[train_size:]
class TextPreprocessor(object):
    def __init__(self, vocab_size):
        self._vocab_size = vocab_size
        self._tokenizer = None

    def create_tokenizer(self, text_list):
        tokenizer = text.Tokenizer(num_words=self._vocab_size)
        tokenizer.fit_on_texts(text_list)
        self._tokenizer = tokenizer

    def transform_text(self, text_list):
        text_matrix = self._tokenizer.texts_to_matrix(text_list)
        return text_matrix
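
The calls to model.fit and DeepExplainer below rely on X_train, X_test, and model, none of which the post defines. A minimal sketch of the missing glue, assuming a 400-word vocabulary and a small dense network (both assumptions of mine):

VOCAB_SIZE = 400  # assumed vocabulary size
train_descriptions = df['Description'].values[:train_size]
test_descriptions = df['Description'].values[train_size:]
processor = TextPreprocessor(VOCAB_SIZE)
processor.create_tokenizer(train_descriptions)
X_train = processor.transform_text(train_descriptions)  # (n_train, VOCAB_SIZE) binary bag-of-words matrix
X_test = processor.transform_text(test_descriptions)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation='relu', input_shape=(VOCAB_SIZE,)),
    tf.keras.layers.Dense(25, activation='relu'),
    tf.keras.layers.Dense(num_tags, activation='sigmoid')])  # one sigmoid output per class
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])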
model.fit(X_train, y_train, epochs = 2, batch_size=128, validation_split=0.1)
print('Eval loss/accuracy:{}'.format(model.evaluate(X_test, y_test, batch_size = 128)))
  • After the model is trained, we use the first 200 training documents as our background dataset to integrate over when creating a SHAP explainer object.
  • We get the attribution values for individual predictions on a subset of the test set.
  • Transform the index to words.
  • Use SHAP’s summary_plot method to show the top features impacting model predictions.
attrib_data = X_train[:200]  # background dataset to integrate over
explainer = shap.DeepExplainer(model, attrib_data)
num_explanations = 40
shap_vals = explainer.shap_values(X_test[:num_explanations])
words = processor._tokenizer.word_index
word_lookup = [''] + list(words.keys())  # index 0 is reserved by the tokenizer
shap.summary_plot(shap_vals, feature_names=word_lookup, class_names=tag_encoder.classes_)
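
For reference on what summary_plot is aggregating: with a multi-output model, shap_values returns a list holding one attribution array per class, each with one row per explained document and one column per vocabulary word (the shapes here assume the sketch above):

print(len(shap_vals))      # number of browser classes
print(shap_vals[0].shape)  # (num_explanations, VOCAB_SIZE)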
  • The word “hotel” is the biggest signal word used by our model, contributing most to predictions for class Edge.
  • The word “rooms” is the 4th biggest signal word used by our model, contributing most to class Firefox, of course.

There is a lot to learn in terms of machine learning interpretability with LIME & SHAP.

Hope this helps :)

Follow if you like my posts. For more help, check my GitHub.

Connect via LinkedIn https://www.linkedin.com/in/afaf-athar-183621105/

Happy learning 😃
