Explain sentiment prediction with LIME

Ian SHEN
Just another data scientist
4 min read · Mar 1, 2019

Introduction

Why do my predictive models make incorrect predictions? As a data scientist, you may ask yourself this question many times. Why does your model label A as negative? Domain experts may grill us with similar questions. To answer them, you need to understand how your machine learning models work. With explainable models such as decision trees, interpretation is straightforward. Unlocking black-box models, however, is not easy.

Since black-box models have demonstrated cutting-edge performance on today’s problems, understanding how these complex models make predictions becomes more and more important, especially in high-stakes domains such as healthcare. For instance, under the EU’s new privacy law (the GDPR), healthcare professionals may be required to explain ML predictions to their patients.

There are a number of methods and techniques for interpreting models, as explained by Lars Hulstaert. In this post, we demonstrate how to explain machine learning predictions with LIME in a sentiment analysis case. Specifically, we explore why some movie reviews are labeled as positive while others are labeled as negative. All the executable code is available on Colab:

https://colab.research.google.com/drive/18dZ6ZTaTfFKiMeAkWHjJx-Fi13abdJKU#scrollTo=PJMz-VVnRehl

Basic knowledge of LIME (short for local interpretable model-agnostic explanations) is not covered in this post; an introduction is available here.

The LIME package

A Python package, lime, was developed to support explaining individual predictions of text classifiers, of classifiers built on numerical or categorical data, and of image classifiers. It is available on PyPI and can be installed with pip:

pip install lime

lime integrates well with scikit-learn and supports any of its built-in models. Documentation and tutorials are available here.

IMDB review sentiment analysis

Sentiment analysis is a common text mining task among data scientists. It usually classifies textual data into two classes: positive and negative. Here we analyze movie reviews from IMDB and build a classifier that labels each review as positive or negative. First, a predictive model is built; then LIME is used to explain its sentiment predictions.

Data

The dataset contains 50k labeled movie reviews stored as raw text, equally split into a training set and a validation set. The training set has 12.5k positive and 12.5k negative reviews, and so does the validation set. The dataset was compiled and made available online by Andrew Maas.

For this post, the dataset has been merged and preprocessed. The training and validation sets are stored as CSV files which can easily be loaded as data frames.
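
A minimal loading sketch (the CSV file names below are assumptions, not necessarily the exact names used in the notebook):

import pandas as pd

# Load the preprocessed training and validation reviews
# (file names are placeholders)
reviews_train = pd.read_csv('imdb_train.csv')
reviews_test = pd.read_csv('imdb_test.csv')

# Each data frame has a 'review' column with the raw text and a 'label' column
reviews_train.head()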

Overview of the dataset

Vectorization

We use CountVectorizer to convert each review into a sparse matrix of token features; with binary=True, each token is marked as present or absent rather than counted. English stopwords are removed.

from sklearn.feature_extraction.text import CountVectorizer

# Binary bag-of-words features with English stopwords removed
cv = CountVectorizer(binary=True, stop_words='english')
cv.fit(reviews_train.review)
X = cv.transform(reviews_train.review)
X_test = cv.transform(reviews_test.review)

Build a predictive model

We selected a random forest, a popular predictive model that often delivers good performance on classification tasks.

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hold out 20% of the training reviews for validation
X_train, X_val, y_train, y_val = train_test_split(
    X, reviews_train.label, train_size=0.8
)

# Build a random forest
rf = RandomForestClassifier(n_estimators=500)
rf.fit(X_train, y_train)
print("Accuracy: %s" % accuracy_score(y_val, rf.predict(X_val)))

Explain prediction with LIME
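
Before walking through individual reviews, here is a minimal sketch of how an explanation can be generated. It wraps the fitted vectorizer and random forest in a pipeline so that LIME can pass raw text to predict_proba; the class names and the chosen review index are assumptions rather than values from the original notebook.

from lime.lime_text import LimeTextExplainer
from sklearn.pipeline import make_pipeline

# Chain the fitted vectorizer and classifier so that raw text goes in
# and class probabilities come out
c = make_pipeline(cv, rf)

# Class names are assumed to match the classifier's label order
explainer = LimeTextExplainer(class_names=['negative', 'positive'])

# Explain an arbitrary review from the test set
idx = 0
exp = explainer.explain_instance(reviews_test.review.iloc[idx],
                                 c.predict_proba, num_features=10)
exp.show_in_notebook(text=True)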

Understanding why reviews are wrongly classified
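
One straightforward way to find such reviews (a sketch, assuming the objects defined above and a label column in the test set) is to compare the model’s test-set predictions with the true labels and explain one of the mismatches:

import numpy as np

# Indices of test reviews the random forest gets wrong
preds = rf.predict(X_test)
wrong = np.where(preds != reviews_test.label)[0]

# Explain one of the misclassified reviews
exp = explainer.explain_instance(reviews_test.review.iloc[wrong[0]],
                                 c.predict_proba, num_features=10)
exp.show_in_notebook(text=True)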

A positive review classified as ‘negative’
The original review

Looking at the original text of the misclassified review, we can see that negation is the problem. ‘Boring’ is the decisive feature that pushes the prediction towards ‘negative’. However, the reviewer actually wrote that ‘there is barely a boring moment’. To improve the performance of our predictive model, we need to take negation into consideration; one simple option is sketched below.
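
As a sketch (an assumption on our side, not the notebook’s actual fix), we could keep common negation words, which the default English stop-word list would otherwise drop, and add bigrams so that phrases such as ‘barely boring’ become features:

from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS

# Keep negation words that the default stop-word list would remove,
# and include bigrams so that negated phrases become features
negation_words = {'no', 'not', 'never', 'nor'}
stop_words = list(ENGLISH_STOP_WORDS - negation_words)
cv_neg = CountVectorizer(binary=True, stop_words=stop_words,
                         ngram_range=(1, 2))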

Explaining individual predictions

Explaining why a review is labeled as negative

The figure shows that the review is labeled as negative because it contains words like worst, waste, bad, awful, and mess, which are all negative words. It is therefore not surprising that the review is classified as ‘negative’.

Explaining why a review is labeled as positive

On the other hand, a review is predicted as ‘positive’ because it has positive words such as best, great, perfect, fantastic.
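
The word weights behind these figures can also be read programmatically; for instance, assuming the exp object from the sketch above:

# A list of (word, weight) pairs; the sign of each weight shows whether
# the word pushes the prediction towards or away from the explained class
exp.as_list()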

Conclusion

LIME is a useful tool that helps data scientists figure out why their predictive models fail and explain individual predictions. Moreover, it is easy to integrate with models built on common machine learning packages. Since LIME is model-agnostic, we will explore using it to interpret other models in our future posts.

If you have any questions about this post or about model interpretability in general, I’ll be happy to read them in the comments.
