Published in ResponsibleML

BASIC XAI

BASIC XAI with DALEX — Part 6: LIME method

Introduction to model exploration with code examples for R and Python.

By Anna Kozak

Welcome to the “BASIC XAI with DALEX” series.

In this post, we present LIME (Local Interpretable Model-agnostic Explanations), a model-agnostic method, which means we can use it with any type of model.

Previous parts of this series are available:

So, shall we start?

First — Idea of LIME method

The LIME method was originally proposed by Marco Ribeiro and co-authors in 2016. The key idea is to approximate the black-box model locally with a simpler glass-box model that is easier to interpret. The figure below shows the idea behind LIME. The violet and light gray areas correspond to the decision regions of a binary classification model. The big black dot corresponds to the instance of interest x₊. The other dots indicate newly generated artificial data. The dashed line corresponds to a simple linear model fitted to the artificial data. It approximates the black-box model around the instance of interest, so the simple linear model “explains” the local behavior of the black-box model.

The intuition behind the LIME method

The objective is to find a local model g that approximates a black-box model f around the point of interest x₊. We can write this as

ĝ = arg min_{g ∈ G} L(f, g, Πx₊) + Ω(g)

We are looking for a local model g from the class G of interpretable models for an instance x₊. The model should be simple, so we add a penalty for model complexity, measured as Ω(g). The glass-box model g should approximate the black-box model f well locally, where Πx₊ denotes the neighborhood of x₊ and L stands for some goodness-of-fit measure. The functions f and g may work on different data: the black-box function f: X → R works on the original feature space X, whereas the glass-box function g: X’ → R usually works on an interpretable feature space X’.
The algorithm can be used to find an interpretable surrogate model that selects the K most important interpretable features.
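To make the objective more concrete, here is a minimal sketch in plain Python. It is not the implementation used by the lime packages. It rests on several simplifying assumptions: the interpretable space X’ is the original space X, L is a squared error weighted by an exponential kernel around x₊, and the complexity penalty Ω(g) is ignored, so all features are kept instead of only the K most important ones. The names `local_linear_surrogate`, `solve_linear_system`, `black_box`, `scale`, and `width` are hypothetical and chosen only for this illustration.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def local_linear_surrogate(f, x_star, n_samples=2000, scale=0.3, width=0.5, seed=0):
    """Fit a weighted linear model g around x_star.

    Returns [intercept, slope_1, ..., slope_p], where g is expressed in
    coordinates centered at x_star, so the intercept is g(x_star).
    """
    rng = random.Random(seed)
    p = len(x_star)
    xtwx = [[0.0] * (p + 1) for _ in range(p + 1)]  # X^T W X
    xtwy = [0.0] * (p + 1)                          # X^T W y
    for _ in range(n_samples):
        # 1. Generate an artificial point near the instance of interest.
        z = [xj + rng.gauss(0.0, scale) for xj in x_star]
        # 2. Weight it by proximity to x_star (exponential kernel).
        dist2 = sum((zj - xj) ** 2 for zj, xj in zip(z, x_star))
        w = math.exp(-dist2 / width ** 2)
        # 3. Accumulate the weighted least-squares normal equations.
        row = [1.0] + [zj - xj for zj, xj in zip(z, x_star)]
        y = f(z)
        for i in range(p + 1):
            xtwy[i] += w * row[i] * y
            for j in range(p + 1):
                xtwx[i][j] += w * row[i] * row[j]
    return solve_linear_system(xtwx, xtwy)


def black_box(x):
    """A toy nonlinear 'black box' with two features."""
    return math.sin(x[0]) + x[1] ** 2


coefficients = local_linear_surrogate(black_box, [0.0, 1.0])
print("intercept and local slopes:", coefficients)
```

For this toy function, the two slopes should land close to the gradient of the black box at the point (0, 1), which is (1, 2). That is the sense in which the linear surrogate describes the model locally, even though it says little about the model far from x₊.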

Second — let’s get a model in R and Python

Let’s write some code. We are still working with the DALEX apartments data. To compute the LIME explanation, we use the lime package in R and the predict_surrogate() method of the dalex explainer in Python (install the lime package before use). The lime packages for R and Python are implemented differently, so the results may not be the same.

Code to create the LIME explanation in Python and R
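As a rough sketch of what the Python side might look like, the function below fits a model on the apartments data and asks dalex for a LIME explanation. Several things here are assumptions and not taken from the post: the random forest model, the one-hot encoding of `district` with `pd.get_dummies`, the choice of the first row as the instance of interest, and the function name `lime_for_apartment`. The third-party imports sit inside the function so the snippet can be loaded even when dalex, lime, pandas, or scikit-learn are not installed.

```python
def lime_for_apartment(row_index=0, seed=0):
    """Fit a model on the apartments data and return a LIME explanation.

    Assumes dalex, lime, pandas and scikit-learn are installed.
    """
    import dalex as dx
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor

    data = dx.datasets.load_apartments()

    # One-hot encode the district so that every column is numeric
    # (this yields columns such as district_Ochota).
    X = pd.get_dummies(
        data.drop(columns="m2_price"), columns=["district"], dtype=float
    )
    y = data["m2_price"]

    # Assumption: a random forest stands in for the black-box model.
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X, y)

    explainer = dx.Explainer(model, X, y, label="random forest", verbose=False)

    # type="lime" delegates the work to the lime package.
    return explainer.predict_surrogate(X.iloc[row_index], type="lime")
```

Calling `lime_for_apartment().as_list()` should give the feature rules together with their local weights. LIME samples new data at random, so the exact numbers can change from run to run.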

Let’s look at the LIME explanation from the R package. As the figure below shows, the district_Ochota variable has the highest positive influence. The next most influential variable is no.rooms. The other variables have only a small influence on the prediction. As we mentioned in the previous post, the Ochota district is close to the city centre, so it has a large impact.

LIME explanation with the lime package in R

Now let’s look at the LIME explanation from the Python package. The district_Srodmiescie variable has the largest impact on the prediction, and its attribution is negative. The surface variable also has a negative influence on the predicted value. The other variables have no significant impact.

LIME explanation with the lime package in Python

In the next part, we will talk about the Ceteris Paribus profile.

If you are interested in other posts about explainable, fair, and responsible ML, follow #ResponsibleML on Medium.

For more R-related content, visit https://www.r-bloggers.com
