Local Interpretable Model-Agnostic Explanations (LIME) — the ELI5 way

Sumit Saha
Intel Student Ambassadors
4 min read · Aug 14, 2019

Introduction

Machine Learning models can seem quite complex when we try to understand why a model predicts what it does. State-of-the-art statistical models often fail to justify their outputs, yet establishing trust is crucial when a user, say a medical professional, needs to make decisions based on those outputs. Hence, it is imperative that we gain insight into the model and turn its predictions from untrustworthy into trustworthy ones.

Given an image (Original Image), how do we deduce which parts of the image contribute to the model correctly classifying it as a frog? To understand this better, let’s ask ourselves (humans) how we came to the conclusion that the image is that of a frog. One of the features/sections of the image that contributes immensely to our deduction is the head of the frog.

Image divided into components to test for interpretation by LIME

What if we could break down these so-called black boxes and get more visibility into how different sections/features of the input data affect the output, i.e. the predictions made by the model?

Local Interpretability

The intuition behind Local Interpretability is to train local surrogate models on perturbed inputs, instead of training a single global surrogate. Furthermore, these local surrogates need not be tied to the specifics of the global model we are using for making predictions.

Machine Learning model as a Black-Box

Our main goal is to understand the reasons behind the model’s predictions by making variations to the data fed into the black box as input. When a data point is fed in, LIME generates a new dataset consisting of perturbed samples and the corresponding predictions of the black-box model. An interpretable model is then trained on this new dataset, weighted by the proximity of the sampled instances to the instance of interest. The local interpretable model can be any of the following:

  • Linear Regression
  • Logistic Regression
  • Decision Trees
  • Naive Bayes
  • K Nearest Neighbors

Furthermore, the learned surrogate should be a good approximation of the black-box model’s predictions locally; it need not be a good global approximator. This form of accuracy is known as Local Fidelity: the local estimator should be faithful in the vicinity of the instance being explained and able to explain predictions there. In addition, the approach should be Model-Agnostic, which basically means it treats the original model as a black box and can therefore explain any model.
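As a quick illustration, here is how the open-source `lime` Python package exposes this workflow for tabular data. This is a minimal sketch; `model`, `X_train`, `X_test`, `feature_names`, and `class_names` are placeholder names assumed to exist already, not objects from the article.

```python
# Minimal sketch of explaining one tabular prediction with the `lime` package.
# Assumes a fitted classifier `model` and numpy arrays X_train / X_test exist.
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    X_train,                        # training data used to learn perturbation statistics
    feature_names=feature_names,    # placeholder: list of column names
    class_names=class_names,        # placeholder: list of class labels
    mode='classification',
)

# Explain a single prediction of the black-box model around X_test[0].
exp = explainer.explain_instance(X_test[0], model.predict_proba, num_features=5)
print(exp.as_list())                # (feature, weight) pairs of the local linear surrogate
```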

LIME is able to explain that the region containing the head of the frog is the most critical reason why the global model classifies the picture as a frog and not any other animal or object
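For images, the same package provides an image explainer that works on superpixels. The sketch below is illustrative and hedged: `image` (an RGB numpy array) and `classifier_fn` (a function mapping a batch of images to class probabilities) are assumed placeholders.

```python
# Illustrative sketch: highlighting the superpixels (e.g. the frog's head)
# that most support the model's top predicted label.
from lime import lime_image
from skimage.segmentation import mark_boundaries

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,              # placeholder: RGB numpy array
    classifier_fn,      # placeholder: batch of images -> class probabilities
    top_labels=1,
    hide_color=0,
    num_samples=1000,
)

# Keep only the regions that speak in favour of the top label.
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
highlighted = mark_boundaries(temp / 255.0, mask)   # scale to [0, 1] if input was uint8
```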

Training Local Surrogate Models

The first task is to select the instance of interest for which we wish to explain the black-box prediction. Next, the dataset is perturbed around that instance and the black-box predictions are collected for the data points in this new dataset. Weights are then assigned to the new samples according to their proximity to the instance of interest, so that the resulting local explanation can show which features contribute evidence for the prediction and which against it. As the next step, we train a weighted, interpretable model on the dataset of variations. Finally, we explain the prediction by interpreting the local model, as in the sketch below.
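The steps above can be condensed into a short from-scratch sketch. Everything here (the `black_box_predict` function, the Gaussian perturbation scheme, the exponential proximity kernel, the ridge surrogate) is an illustrative assumption, not the exact procedure of the LIME library.

```python
# From-scratch sketch of the local-surrogate procedure described above.
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(black_box_predict, x_interest, num_samples=5000, kernel_width=0.75):
    """Fit a weighted linear surrogate around one instance of interest."""
    n_features = x_interest.shape[0]

    # 1. Perturb the instance of interest by sampling noise around it.
    perturbed = x_interest + np.random.normal(0, 1, size=(num_samples, n_features))

    # 2. Query the black box for predictions on the perturbed samples
    #    (e.g. probability of the target class).
    preds = black_box_predict(perturbed)

    # 3. Weight each sample by its proximity to the instance of interest
    #    (exponential kernel over Euclidean distance).
    distances = np.linalg.norm(perturbed - x_interest, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))

    # 4. Train a weighted, interpretable model on the perturbed dataset.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbed, preds, sample_weight=weights)

    # 5. The surrogate's coefficients explain the local prediction:
    #    positive coefficients support it, negative ones count against it.
    return surrogate.coef_
```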

Advantages of LIME:

  • If Decision Trees were used as the local surrogate for, say, a Random Forest, the same surrogate can still be used when the underlying model is replaced with a Support Vector Machine (see the sketch after this list).
  • Works well with Textual, Image, and Tabular data.
  • Fidelity measures give us a good idea of how reliable the interpretable model is at explaining the black-box predictions in the region of interest of the data.
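To illustrate the first point, here is a hedged sketch reusing the tabular explainer from the earlier snippet: swapping the black box from a Random Forest to a Support Vector Machine changes only the prediction function passed in, not the explanation procedure. `X_train`, `y_train`, `X_test`, and `explainer` are the assumed placeholders from before.

```python
# The same explainer is reused unchanged when the underlying black box is swapped.
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rf = RandomForestClassifier().fit(X_train, y_train)
svm = SVC(probability=True).fit(X_train, y_train)

# Only the predict function changes; the local surrogate and its interpretation do not.
exp_rf = explainer.explain_instance(X_test[0], rf.predict_proba, num_features=5)
exp_svm = explainer.explain_instance(X_test[0], svm.predict_proba, num_features=5)
```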
