Explaining Machine Learning Models

Suraj Banjade
Jun 4, 2018 · 4 min read

The following is a predicted* conversation between a Data Analyst and a Customer Retention Manager:

Data Analyst: I have this great model, which can predict customer churn with really high accuracy!! Yayyy

Customer Retention Manager: Awesome! Can you send me the list of all customers likely to churn?

Data Analyst: Here is the spreadsheet with the list of customers whose probability of churning is greater than 60%.

Customer Retention Manager: I see that customer abc123 has a 90% probability of churning. Why?

Data Analyst: Well, I just built a very complex machine learning model. The prediction is based on 50 different factors. The model is accurate because it combines many complex variables.

Generally, the conversation stops at this point. I feel machine learning research has focused a lot on creating complex and accurate models. Explainability, however, has been the casualty of greater accuracy and complexity.

According to the authors of LIME, a brilliant algorithm: “Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model.”

Giving the Customer Retention Manager the list of customers with a high probability of churn is a brilliant first step. However, to build a good customer retention strategy, the manager needs to know which factors are pushing the churn probability up or down. This is where LIME comes to the rescue. The following is an excerpt from the author of the lime algorithm.

“The purpose of lime is to explain the predictions of black box classifiers. What this means is that for any given prediction and any given classifier it is able to determine a small set of features in the original data that has driven the outcome of the prediction.”
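To make the idea concrete, here is a minimal Python sketch of the core mechanism as I understand it: perturb one instance, ask the black box for predictions on the perturbed copies, weight each copy by how close it is to the original, and fit a weighted linear model whose coefficients serve as the explanation. Everything here is an assumption for illustration: `black_box` is a hypothetical stand-in model, and `solve` and `explain` are names I made up. The real lime package does more (for example, it works on an interpretable representation of the features and selects a small subset of them), so treat this as a simplification rather than the library's implementation.

```python
import math
import random


def black_box(x):
    """Hypothetical stand-in for an opaque churn model: returns P(churn)."""
    # Nonlinear on purpose; the explainer only calls it, never inspects it.
    score = 1.5 * x[0] - 2.0 * x[1] + 0.5 * x[0] * x[2]
    return 1.0 / (1.0 + math.exp(-score))


def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w


def explain(predict, x, n_samples=2000, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around x.

    Returns one local weight per feature (the intercept is dropped).
    """
    rng = random.Random(seed)
    d = len(x)
    xtx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xty = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, scale) for xi in x]       # perturbed neighbour
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        w = math.exp(-dist2 / kernel_width ** 2)             # closer samples count more
        row = [1.0] + [zi - xi for zi, xi in zip(z, x)]     # intercept + centred features
        y = predict(z)
        for i in range(d + 1):
            xty[i] += w * row[i] * y
            for j in range(d + 1):
                xtx[i][j] += w * row[i] * row[j]
    return solve(xtx, xty)[1:]


weights = explain(black_box, [0.0, 0.0, 1.0])
print([round(w, 3) for w in weights])
```

A positive weight means that nudging the feature upward raises the predicted churn probability near this particular customer; a negative weight means the opposite. The weights are local: a different customer can get a different explanation from the same model.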

Using the publicly available IBM Watson Telco Customer Churn data set, I will create a model that predicts customer churn. I will then use the LIME algorithm to explain the results. The following model building is inspired by a blog post by Matt Dancho.

Variables available for predicting churn

I will use a generalized linear model because such models are quick to fit and produce accurate results. I built a glmnet model using caret's standard parameters. The AUC was 0.85, which is accurate enough to proceed to the next step.
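For readers unfamiliar with AUC, it can be read as the probability that a randomly chosen churner gets a higher score than a randomly chosen non-churner. The sketch below computes it directly from that definition in plain Python. It is not the caret code behind the number above; the function name `auc` and the labels and scores are made up purely for illustration.

```python
def auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # churner ranked above non-churner
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Invented labels (1 = churned) and model scores, not the Telco data.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
print(auc(labels, scores))
```

The pairwise loop is quadratic in the number of customers, which is fine for a small illustration; library implementations typically use a rank-based shortcut instead.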

I ran the model on the test data, and here are the results for the first five cases. The model got Case No 1 wrong, but the remaining four are correct.

Actual Vs Predicted Results

As mentioned earlier, giving the above list to the Customer Retention Manager is a good first step. However, to convert this intelligence into actionable intelligence, a data scientist should be able to explain the probabilities. Why was the customer in the first row given a 39% probability of churning, compared to 4% for the customer in the fifth row? I will run these five cases through the LIME algorithm and explain the results.

In the example below, the LIME algorithm shows the seven most important factors that contributed to the model's output on whether each customer will churn. A green bar means that the feature supports the model's conclusion, and a red bar means that it contradicts it.

Explanation of the model for the first five cases
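The way such a plot is assembled can be sketched in a few lines of Python: rank the features by the size of their weight, keep the top few, and tag each one by the sign of its weight. This is only my illustration of the idea, not the lime package's plotting code. The function name `top_features` is hypothetical, and while the feature names echo this article, the numbers are invented.

```python
def top_features(weights, k=7):
    """Return the k largest-magnitude features, tagged by the sign of the weight."""
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]
    return [
        (name, w, "supports" if w > 0 else "contradicts")
        for name, w in ranked
    ]


# Hypothetical local weights for one customer; the values are invented.
case_weights = {
    "two_year_contract": 0.30,
    "total_charges": 0.12,
    "fiber_optic": -0.25,
    "streaming_tv": -0.08,
    "paperless_billing": 0.01,
}

for name, w, role in top_features(case_weights, k=4):
    print(f"{name:20s} {w:+.2f} {role}")
```

Here a positive weight is read as supporting the predicted label and a negative weight as contradicting it, which corresponds to the green and red bars respectively.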

In Case No 2, the model predicted a 54% probability that the customer will churn. According to the LIME algorithm, having fiber optic and streaming movies supports the model's prediction that the customer will churn. If we see the same pattern for multiple customers, it could mean that customers who have fiber optic and like to stream movies are not happy with the product.

In Case No 4, the model predicts that the probability of churn is 2%. Total Charge and a two-year contract support the model's conclusion, whereas fiber optic and streaming TV contradict it. It is likely that this 2% chance of churn stems from the fact that this customer has fiber optic and likes to stream movies.

Overall, the LIME algorithm is an excellent addition to the machine learning toolbox. It enables data scientists to understand why a specific prediction was made and provides a framework for explaining the results in an easy-to-understand format. LIME is a valuable conduit for combining the power of statistics with the gut feeling that front-line managers have about customers.

I fully recommend that data scientists read this paper on the LIME algorithm. Special thanks to Matt Dancho, who made understanding the LIME algorithm very easy with his valuable blog posts.

Written by Suraj Banjade

Chemical Engineer turned Data Analyst. I hate consuming content so I create mine.
