Explain Your ML Model Predictions With Local Interpretable Model-Agnostic Explanations (LIME).

Daria Nguyen
Publicis Sapient France
14 min read · Mar 30, 2020

Being able to explain and interpret your Machine Learning (ML) model predictions is really important. It helps us understand and analyze the features that have the greatest impact on predictions. This, in turn, builds trust: business users get interpretable explanations and can stay confident about the model’s outputs.

This article contains several sections:

  1. General concepts of explainability and interpretability of ML models.
  2. What is LIME and how does it work? — a detailed explanation of the LIME framework in case you need a solution for delivering local and model-agnostic explanations.
  3. A couple of illustrative examples of LIME delivering explanations for text and image classification, together with Python code, aiming to simplify your first experience with LIME.

1. General concepts of explainability and interpretability of ML model predictions.

Let’s start with a few key definitions:

  • An interpretation is the mapping of an abstract concept (e.g. a predicted class) into a domain that a human can make sense of. Interpretable domains include, for example, images (arrays of pixels) and texts (sequences of words): a human can look at an image or read a text and understand, i.e. interpret, the content. However, there are also NON-interpretable domains, such as abstract vector spaces (embeddings) or undocumented input features, i.e. unknown or abstract words and symbols.
  • An explanation is the collection of features of the interpretable domain that contributed, for a given example, to producing a decision (e.g. a classification or regression). An explanation can be, for example, a heatmap highlighting which pixels of the input image most strongly support the classification decision. It can also be computed at a finer grain, for example not only image zones but also color components. In natural language processing, explanations can take the form of highlighted words in the text, etc.

Why do we care about interpretations and explanations?

Many machine learning models are black boxes, so understanding the rationale behind a model’s predictions certainly helps users decide whether or not to trust it.

Figure 1. The model as a black box.

Imagine that we have a model that assists doctors in diagnosing whether a patient is sick, using data from a pre-filled questionnaire. The model predicts that a certain patient is most probably sick. The decision about the final diagnosis and treatment still needs to be taken by a human. Here an “explainer” can help a lot, telling the doctor why the model concluded that the patient is sick and highlighting the symptoms of this particular patient that most significantly impacted the model’s prediction, for example:

Figure 2. Model explanations can help a human understand the rationale behind the model’s prediction

With this information about the rationale behind the prediction, the domain expert now has a way to assess the validity of the model’s decision.

What vs. why?

Usually, when developing and employing an ML model, the objective is to answer the question “What?“: What is the probability of an earthquake? What will be the optimal price of our product? What is the class of a tumor? But then, as in the example above, the user of the model wants to be able to answer the question “Why?“. Here we face a dilemma and have to acknowledge that modern learning algorithms exhibit a tradeoff between explainability and accuracy. In this context, deep learning models are potentially the most accurate and the least interpretable. But can we do something about it? Can we keep using the models with the best achievable performance and, at the same time, find a way to explain their predictions clearly?

Figure 3. ML Models prediction accuracy vs. explainability according to DARPA study.

Source: DARPA XAI Industry Presentation

How can we explain our model results?

There are a number of ways the predictions of an ML model can be explained. Broadly, explanation methods can be positioned along two dimensions: local vs. global explanations, and model-specific vs. completely model-agnostic explanations. We can explain the model’s decisions for the whole population, which makes the explanation global, but we can also try to explain the prediction the model makes for a single specific instance, which makes it very local. Being able to give explanations at the local level matters: once a model has concluded from your symptoms that you are sick and need urgent surgery with probability 0.8, you obviously want to know why that decision was taken in your specific case, and whether the doctor got enough evidence from the model to send you to the operating theatre. Likewise, we may be happy with one single model, in which case all we need is a way to explain predictions delivered by that specific algorithm (or a family of similar algorithms). On the other hand, we may want to be able to explain the predictions of pretty much any ML model with a single tool or technique. Practice shows that in many cases businesses demand an explanation for a single instance, and it is highly desirable that the explanation stays model-agnostic.

Figure 4. Linear regression model for prediction of daily temperatures.

The explainability of linear models seems pretty straightforward. From a mathematical standpoint, the prediction is a linear combination of the predictor values, weighted by the model coefficients. Clearly, the higher the absolute weight, the more significant the impact of the corresponding predictor. But linear models typically show rather low performance when we have to deal with complex, non-linear dependencies.
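As a minimal sketch of this idea (with purely hypothetical weather-like features, not the actual data behind Fig. 4), the coefficients of a fitted linear model can be read directly as an explanation:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                        # e.g. humidity, wind speed, pressure (illustrative)
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
for name, coef in zip(['humidity', 'wind_speed', 'pressure'], model.coef_):
    print(f'{name}: weight = {coef:+.2f}')           # larger |weight| => stronger impact on the prediction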

In his 2014 Ph.D. thesis, Gilles Louppe analyzed and discussed the interpretability of a fitted random forest model through the lens of variable importance measures. Today we can access these measures via the .feature_importances_ attribute of scikit-learn’s RandomForestClassifier() estimator, which lets us understand which features, across the splits of the ensemble’s decision trees, had the most significant impact on the final model prediction.
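As a minimal sketch (the iris data set and the model settings below are illustrative, not the exact code behind Fig. 5):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(iris.data, iris.target)

for name, importance in zip(iris.feature_names, rf.feature_importances_):
    print(f'{name}: {importance:.3f}')               # impurity-based importance, averaged over all trees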

Figure 5. Feature importance of a RandomForest model.

We can clearly observe the impact of each feature on the model’s predictions across all instances of the training set. That is useful, but not really enough, because we cannot explain the model’s decision at a very local level, i.e. for a single flower instance. So, what can we do here?

In the case of RandomForest, a feature contribution method helps us achieve good results: since scikit-learn 0.17, the .tree_ attribute of the individual decision trees in the forest stores and exposes the values of all nodes, not just the leaves. To access the feature contributions for each instance, we can use the treeinterpreter package, which makes it rather simple:

prediction = bias + feature 1 contribution + … + feature n contribution
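Here is a minimal sketch of this decomposition using the treeinterpreter package; the data set, model settings, and instance index are illustrative assumptions, not the exact code behind Fig. 6:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from treeinterpreter import treeinterpreter as ti    # pip install treeinterpreter

iris = load_iris()
rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(iris.data, iris.target)

instance = iris.data[100].reshape(1, -1)             # a single flower to explain
prediction, bias, contributions = ti.predict(rf, instance)

predicted_class = int(np.argmax(prediction[0]))
print('predicted probabilities:', prediction[0], ' bias:', bias[0])
for name, contrib in zip(iris.feature_names, contributions[0]):
    # signed contribution of each feature towards the predicted class probability
    print(f'{name}: {contrib[predicted_class]:+.3f}')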

Figure 6. Contribution of RandomForest model features to the predicted class of a single instance.

Now that we can access the contribution of each feature towards the classification of a single iris flower, we can see that the strongest contributors to predicting the second class for this instance were the petal width and the petal length (see the red frame).

With this, we have a way to discover each feature’s contribution to the prediction for an individual instance. Not bad, is it? Unfortunately, this explanation can only be extracted if we are using a RandomForest model or a similar algorithm from the tree-based family.

Is there anything better, then? Can we get the same kind of explanation while keeping the freedom to use any model to solve our prediction problem? In other words, we are looking for an explanatory framework that is not only local but also model-agnostic.

And we have the Local Interpretable Model-Agnostic Explanations (LIME) framework to help us!

2. What is LIME and how does it work?

The idea behind LIME is rather simple and elegant: for each individual instance that we pass to the model, and for each prediction it makes, we can always perform a “local sensitivity analysis“ to understand how sensitive the prediction is to each feature of that particular instance. The beauty of LIME is also that it stays model-agnostic for classical ML tasks such as classification and regression.

So, how does it work? LIME modifies our specific data sample by slightly tweaking the feature values and collects the resulting impact of each individual feature change on the prediction (the output). This helps us answer the model users’ question: what exactly caused the prediction to be like this?

Let’s look at an abstract theoretical example (Fig. 7):

Figure 7. Illustration of a local explanation methodology employed by LIME.

The original decision function is represented by the blue/orange background and is clearly nonlinear. The largest red “X” is the instance we want to explain. We simply perturb instances around X and weight them according to their proximity to X (the weights are represented by the sizes of the symbols, the blue circles and red Xs). We obtain the original model’s predictions on these perturbed instances, and then learn a linear model (the black line) that approximates the model well in the vicinity of X. Please note that the explanation, in this case, is not faithful globally, but it is faithful locally around our instance X.

The explanation produced by LIME at a local point x is obtained by the following generic formula:

ξ(x) = argmin_{g ∈ G} L(f, g, π_x) + Ω(g),

where f is the original function we want to explain (our ground truth model), g is a surrogate function we use to approximate f in the proximity of x, and π_x defines the locality. This formulation can be used with different explanation families G, fidelity functions L, and complexity measures Ω. Here we treat complexity as the opposite of explainability. Typically, g belongs to the family of linear functions (low complexity), but of course that is not always possible, and we should also consider using non-linear functions. In that case, the complexity Ω(g) becomes higher and explainability decreases. The choice of a complexity measure is not trivial: in some cases it can be the degree of a polynomial function, but one may also choose, for example, computation time as the complexity measure. The loss function L lets us minimize the local mismatch between the original complex function f and the approximating function g, so it is often a familiar loss function such as RMSE.

Now let’s talk about how to interpret the explanations acquired with LIME, using a simple practical example. Once we look at a sample of instances in a certain proximity of our individual instance x, we build a linear regression model applicable to this small sample. The simplified example below (Fig. 8) illustrates the probability of cancer, using a person’s age as the only predictor.

Figure 8. Example of LIME explanations construction for a single feature case.

Looking at the example above, if we consider age to be the only independent variable impacting the probability of cancer, we can see that around the age of 20 the explaining linear model is rather “flat“, whereas for a sample around the age of 60 the probability of cancer varies much more significantly with age between 55 and 65 years. That is what helps us explain the local impact of age on the model function (the probability of cancer).
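Here is a toy, from-scratch sketch of that idea: perturb around a given age, weight the samples by proximity, and fit a weighted linear surrogate. The “black box” cancer-probability function, the kernel width, and the sample size are all illustrative assumptions, not a real medical model:

import numpy as np
from sklearn.linear_model import Ridge

def cancer_probability(age):
    # an illustrative non-linear "black box" f, NOT a real medical model
    return 1.0 / (1.0 + np.exp(-(age - 55.0) / 5.0))

def local_slope(age, kernel_width=5.0, n_samples=500):
    perturbed = age + np.random.normal(scale=kernel_width, size=n_samples)   # sample around the instance
    weights = np.exp(-((perturbed - age) ** 2) / (2 * kernel_width ** 2))    # proximity kernel pi_x
    g = Ridge(alpha=1.0).fit(perturbed.reshape(-1, 1),
                             cancer_probability(perturbed),
                             sample_weight=weights)                          # weighted linear surrogate g
    return g.coef_[0]

print('local effect of age around 20:', local_slope(20.0))   # almost flat
print('local effect of age around 60:', local_slope(60.0))   # much steeper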

Of course, some questions remain: how to choose the number of neighbors to analyze, the vicinity limit in terms of distance, and so on. In many cases this is just empirical knowledge, and it should be mentioned here as one of the main weaknesses of LIME: our explanations will obviously depend on the values chosen for these parameters of a LIME explainer. We can expect, however, that as LIME gets applied more widely, some “rules of thumb“ may emerge for specific categories of applications.

Let’s look at practical and slightly more complex examples, using several public datasets.

3. Illustrative examples of LIME delivering explanations for text and image classification.

3.1 Text classification for the 20-NewsGroups dataset.

We all love this data set because it allows Data Science students to sharpen their skills and put their theoretical knowledge into practice. (If you do not believe me, simply try to perform a cluster analysis to detect the topic clusters.)

To make the illustration of the LIME application easier to follow, I keep only two of the 20 classes (alt.atheism and soc.religion.christian) and train a simple classifier (an SVM) that will tell us the class of each news post.

Here is the code that vectorizes the texts, then trains and evaluates a default SVM classifier, which demonstrates pretty impressive performance: an F1 score of 0.93! Nothing to complain about! But things get more interesting when we try to explain the predictions…
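A minimal sketch of this setup, assuming TF-IDF features and scikit-learn’s SVC (exact settings and scores may differ slightly):

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC
from sklearn import metrics

categories = ['alt.atheism', 'soc.religion.christian']
train = fetch_20newsgroups(subset='train', categories=categories)
test = fetch_20newsgroups(subset='test', categories=categories)

vectorizer = TfidfVectorizer(lowercase=False)
X_train = vectorizer.fit_transform(train.data)
X_test = vectorizer.transform(test.data)

clf = SVC(probability=True)                         # probability=True is needed later for LIME
clf.fit(X_train, train.target)
pred = clf.predict(X_test)

print('accuracy =', metrics.accuracy_score(test.target, pred))
print('precision =', metrics.precision_score(test.target, pred))
print('recall =', metrics.recall_score(test.target, pred))
print('f1 =', metrics.f1_score(test.target, pred))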

accuracy = 0.91, precision = 0.87, recall = 0.99, f1 = 0.93

Now let’s try to get explanations. To do so, we create a pipeline packaging the vectorization and classification steps, and a LimeTextExplainer to explain the news post with index 13 in the test set. Let’s first take a look at its content and also check the probabilities our classifier assigns to each of the news classes.
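Continuing the sketch above, this could look roughly as follows (LimeTextExplainer is LIME’s explainer for raw text; the class names are passed in so the output is readable):

from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

pipeline = make_pipeline(vectorizer, clf)           # raw text in, class probabilities out
class_names = list(test.target_names)               # ['alt.atheism', 'soc.religion.christian']
explainer = LimeTextExplainer(class_names=class_names)

idx = 13                                            # the news post we want to explain
print('True class:', class_names[test.target[idx]])
print('Predicted class:', class_names[pipeline.predict([test.data[idx]])[0]])
print(pipeline.predict_proba([test.data[idx]]).round(3))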

True class: alt.atheism
Predicted class: alt.atheism
[[0.985 0.015]]

The classifier predicts that our text belongs to the alt.atheism topic and is very confident about this prediction: as you can see, the probability for this class is 0.985. So far, nothing to worry about. But what if we ask LIME to explain why the classifier assigned the class alt.atheism to this text? For that, we instantiate an explanation using the explain_instance method.
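Continuing the sketch above, the explain_instance call could look like this, keeping the five most influential words for each of the two classes:

exp = explainer.explain_instance(test.data[idx], pipeline.predict_proba,
                                 num_features=5, labels=(0, 1))
for label in (0, 1):
    print('Explanation for class', class_names[label])
    for word, weight in exp.as_list(label=label):
        print('%s, %.3f' % (word, weight))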

Explanation for class alt.atheism
sgi, 0.091
edu, 0.067
rice, 0.053
wpd, 0.048
com, 0.047

Explanation for class soc.religion.christian
sgi, -0.091
edu, -0.067
rice, -0.053
wpd, -0.048
com, -0.047

Do you see anything strange? Indeed, the classifier took into account, and relied heavily upon, the highlighted pieces of text, which in human understanding are not really meaningful! If a person had to decide on the topic of this content, these highlighted text elements would never be taken into account… So, can we trust our classifier? Should we perhaps take a closer look at our original data?

After receiving an explanation from LIME, a more detailed analysis should be performed. It uncovers that the majority of the news posts belonging to the alt.atheism class contained technical message headers with specific email addresses and NNTP-Posting-Host details, which is probably a consequence of the way the data set was created.

In my opinion, this is an awesome example of why interpreting and understanding model results might be important, even when all the formal performance metrics of our classifier look very solid. To conclude here, it is necessary to mention that once better text pre-processing has been applied and all meaningless components, such as parts of email addresses, have been removed, the situation changes: the estimated performance of the classifier drops (accuracy = 0.85, precision = 0.81, recall = 0.94, f1 = 0.87), but now the model focuses on the content of the news body and not the technical parts of the header, which obviously makes much more sense and delivers real value for the end users of our model.

3.2 Image classification with a DNN in Keras: can LIME tell us something here?

Now that we have demonstrated how to apply LIME to text and interpret the results of a classical ML classifier like SVM, let’s try to explain and interpret decisions delivered by Deep Learning-based algorithms.

For illustration purposes, we use here the InceptionV3 model from Keras pre-trained on ImageNet data set. For our experiment, we will use the image of an owl head (Fig. 9) from a set of confusing images and a custard apple (Fig. 11).

The code below instantiates the model with all its weights and also creates a transformation function that converts any input image to the 299x299 input size this model requires.
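A minimal sketch of this step (the file name owl_head.jpg is a hypothetical placeholder for the image in Fig. 9):

import numpy as np
from keras.applications import inception_v3 as inc_net
from keras.preprocessing import image

model = inc_net.InceptionV3(weights='imagenet')     # pre-trained on ImageNet

def transform_img(path):
    # load an image file, resize it to the 299x299 input InceptionV3 expects,
    # and apply the model-specific pixel scaling (values end up in [-1, 1])
    img = image.load_img(path, target_size=(299, 299))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    return inc_net.preprocess_input(x)

owl = transform_img('owl_head.jpg')                 # hypothetical file name
preds = model.predict(owl)
print(inc_net.decode_predictions(preds, top=1)[0])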

Figure 9. Image of an owl head.

Here are the model’s predictions: (‘n07760859’, ‘custard_apple’, 0.20883653)

As you can see, our model predicts that this is a custard apple. In fact, it is an owl’s head. But can we check why the model might confuse an owl with an apple? Let’s see how LIME explains this prediction using LimeImageExplainer.
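A minimal sketch, following the standard LIME image tutorials and reusing the model and the preprocessed owl image from the snippet above (the sampling parameters are illustrative):

import matplotlib.pyplot as plt
from lime import lime_image
from skimage.segmentation import mark_boundaries

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(owl[0].astype('double'), model.predict,
                                         top_labels=5, hide_color=0, num_samples=1000)

# keep only the superpixels that speak in favor of the top predicted label
temp, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                            positive_only=True, num_features=5,
                                            hide_rest=False)
plt.imshow(mark_boundaries(temp / 2 + 0.5, mask))   # undo the [-1, 1] scaling for display
plt.show()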

Here is what our model sees in the photo. The green regions, as well as the equivalent segments of the original image, show us the areas that support the decision in favor of the fruit.

Figure 10. Interpretable explanations from LIME for an owl head classification.

Now let’s look at an image of a real custard apple and check if our model might have good reasons to be confused and give a wrong prediction.

Figure 11. Image of a custard apple.

The model has detected the real custard apple here exceptionally well!

(‘n07760859’, ‘custard_apple’, 0.99996376)

But let’s also look at the explanations here and see whether they resemble the explanations for the first image.

Figure 12. Interpretable explanations from LIME for a custard apple classification.

Looking at the LIME explanations for both images, we can say that, although the model made a mistake, the wrong answer was reasonable because we understand the logic behind it: the images were indeed confusing. We can observe and identify the elements of the images that make them look similar and cause the misclassification. With this knowledge, we can try to further train our model, feeding it enough relevant examples to finally achieve good classification results for such cases.

I hope you now agree that LIME is a powerful and easy-to-use tool for cases when you really need a local and model-agnostic explanation of your model’s results. However, keep in mind that every tool has its limitations; in the case of LIME, it is its sensitivity to the hyper-parameter values of the explainers. LIME also struggles with high-dimensional data: given a large number of predictors, it cannot by itself discriminate between relevant and irrelevant features, and with many dimensions the distances between close and remote data points become hard to tell apart. This implies a need for some preliminary work on dimensionality reduction and feature selection before applying LIME.

As a next step, if you want to move to a higher level of generalization and better understand the underlying forces and drivers behind certain effects, it is time to look at SHAP (SHapley Additive exPlanations), one of the most advanced modern tools for this purpose.

NOTE: The code snippets were largely generated using examples from the various LIME tutorials available in the LIME GitHub repository.
