Explaining Machine Learning Models to your Client

Rafael Alencar · Published in Neuronio · Apr 22, 2019 · 5 min read

Note: a Portuguese version of this article is available at "Explicando Modelos de Machine Learning para seu Cliente".

Even for those who work daily with Machine Learning, it is hard to explain how the models we create work and how they make their decisions. Although their results can be very good, it is extremely important to explain how these models work to the people who are paying for them.

Why Open the Black Box?

With all the recent advances in Machine Learning, the better our models get, the more complex they become. As a result, we have started creating black-box models that we sometimes trust without questioning how they work. When we talk about Explainable Artificial Intelligence (XAI), we are talking about making a model's decisions more transparent and interpretable for human beings, even when those decisions are not intuitive. Opening these black boxes can bring many benefits, as we will see below.

Debugging

Besides knowing why your model works, it is also good to know when it doesn't, so fixes can be made as soon as possible. When working with complex models, there are many variables that could be causing problems, which can turn debugging into a nightmare. Knowing each layer of your model and how it operates helps us isolate the misbehaving parts and figure out how to fix them.

Improve Data Collection

Many Machine Learning techniques focus on feature extraction and interpretation. Many of these models receive a massive amount of data, and much of that data has no explicit meaning for humans, so these extracted features help us understand what the model is actually seeing in its input.

Now, let's think about a real situation. An insurance company uses a model to define the price of a policy based on customer information. Which data should we use? How do we know which features are the most important? Data collection is not always cheap, so if we know the most relevant features, we can better use our resources to obtain quality data in the most efficient way.
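When a model already exists, one quick way to rank feature relevance is permutation importance, as in the minimal sketch below. The file name, column names, and the random forest are illustrative assumptions, not part of any real insurance system.

```python
# A minimal sketch: ranking features of a hypothetical policy-pricing model
# with permutation importance (scikit-learn). Data and column names are assumed.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = pd.read_csv("customers.csv")           # hypothetical customer records
features = ["age", "car_value", "past_claims", "region_risk"]
X_train, X_val, y_train, y_val = train_test_split(
    data[features], data["policy_price"], random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Shuffle each feature and measure how much the validation score drops:
# the bigger the drop, the more the model relies on that feature.
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
for name, score in sorted(zip(features, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```

A ranking like this tells us which data is worth collecting carefully and which columns add little value.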

Generate Trust and Transparency

As I said before, the way these models work is not entirely intuitive for those who don't work with them daily. Some applications only need good accuracy to make your client happy. However, a medical application that decides whether a person needs surgery based on their exams must come with a very transparent explanation of its decision. Banks' credit-granting models should also be transparent to their clients, due to legal or even ethical requirements.

How to Open a Black Box?

Methods that bring explainability to our models usually pursue one of two objectives: global explanation or local explanation.

Global explanation is about interpreting how the entire model works, across all situations. It is usually applied to simple models such as linear models, probabilistic models, decision trees, and other white-box models.
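As a minimal sketch of what a global explanation looks like in practice: for a linear model, the fitted coefficients describe how every feature pushes the prediction for all inputs at once. The tiny dataset and feature names below are purely illustrative.

```python
# A minimal sketch of a global explanation: the coefficients of a linear model
# describe its behaviour for every possible input, not just one example.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[25, 1.0], [40, 0.2], [33, 0.7], [58, 0.1]])  # [age, past_claims_rate]
y = np.array([520.0, 310.0, 450.0, 280.0])                  # hypothetical policy prices

model = LinearRegression().fit(X, y)

for name, coef in zip(["age", "past_claims_rate"], model.coef_):
    print(f"{name}: {coef:+.2f} per unit")
print(f"intercept: {model.intercept_:.2f}")
```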

Local explanation, on the other hand, focuses on explaining specific parts of complex models and specific situations. Normally we explore some aspects of their architecture and how different inputs affect the model's predictions.

Examples of Local Explanation Techniques

Visualizing Hidden Layers

Convolutional Neural Networks (CNNs) are very popular when working with images. Their convolutional layers contain many filters that are responsible for feature extraction. If we think of each filter as a sub-image, then when a filter activates, the convolutional layer is telling us that it found that sub-image inside the original image. The picture below shows some of these layers from a classification CNN.

Features from each layer of a CNN
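To give an idea of how a picture like this can be produced, here is a minimal sketch that extracts and plots the feature maps of one convolutional layer. It uses a pretrained VGG16 from Keras and a random image only as stand-ins for the model and data shown in the figure.

```python
# A minimal sketch: plotting the feature maps of one convolutional layer of a
# pretrained Keras CNN. VGG16 and the random image are stand-ins for illustration.
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras

cnn = keras.applications.VGG16(weights="imagenet", include_top=False)
layer = cnn.get_layer("block1_conv2")                    # one early convolutional layer
activation_model = keras.Model(inputs=cnn.input, outputs=layer.output)

image = np.random.rand(1, 224, 224, 3)                   # replace with a real image
feature_maps = activation_model.predict(image)[0]        # shape: (h, w, n_filters)

# Each feature map highlights where its filter "found" the pattern it looks for
fig, axes = plt.subplots(4, 4, figsize=(8, 8))
for i, ax in enumerate(axes.flat):
    ax.imshow(feature_maps[..., i], cmap="viridis")
    ax.axis("off")
plt.show()
```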

Projection-Based Techniques

Unfortunately, our eyes can only see at most 3 dimensions. Deep learning models, however, can work with thousands of dimensions. For example, when working with text, it is common to vectorize words into numeric N-dimensional vectors using Word2Vec. Using tools such as t-distributed Stochastic Neighbor Embedding (t-SNE), we can represent these multidimensional vectors by their projection onto a 2-dimensional space, as shown in the picture below:

A Word2Vec vocabulary projected with t-SNE
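The sketch below shows one way such a plot can be produced with gensim and scikit-learn; the toy corpus is purely illustrative, and a real vocabulary would be far larger.

```python
# A minimal sketch: projecting Word2Vec embeddings onto 2-D with t-SNE.
# The tiny corpus is only for illustration.
import numpy as np
import matplotlib.pyplot as plt
from gensim.models import Word2Vec
from sklearn.manifold import TSNE

sentences = [["king", "queen", "castle"], ["paris", "london", "city"],
             ["dog", "cat", "pet"], ["king", "castle", "city"]]
w2v = Word2Vec(sentences, vector_size=50, min_count=1, seed=0)

words = list(w2v.wv.index_to_key)
vectors = np.array([w2v.wv[word] for word in words])

# t-SNE squeezes the 50-dimensional embeddings into 2 dimensions we can plot
points = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(vectors)

plt.scatter(points[:, 0], points[:, 1])
for (x, y), word in zip(points, words):
    plt.annotate(word, (x, y))
plt.show()
```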

Model-Agnostic Explanations

It would be very nice if Machine Learning models could tell us which features were important when they made a decision, and fortunately, they can. For this kind of explanation we can use Local Interpretable Model-agnostic Explanations (LIME). Model-agnostic means it can be used with any Machine Learning model: it simply makes small changes to an input, using other data or some noise, and checks how these changes affect the model's predictions. Below we can see this applied to a text classifier using the Python lime package, which shows the score of each class and the most important words for the two best-scored classes.
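Here is a minimal sketch of how this can look with the lime package; the trained classifier, the class names, and the input sentence are assumptions for illustration.

```python
# A minimal sketch of the lime package on a text classifier. `classifier` is
# assumed to be a trained scikit-learn pipeline (vectorizer + model) exposing
# predict_proba, and `class_names` a list of category names; both are assumptions.
from lime.lime_text import LimeTextExplainer

explainer = LimeTextExplainer(class_names=class_names)

text = "You are invited to my birthday party!"     # illustrative input
explanation = explainer.explain_instance(
    text,
    classifier.predict_proba,   # LIME perturbs the text and queries this function
    num_features=5,             # how many important words to report
    top_labels=2)               # explain the two best-scored classes

# Word/weight pairs for the best-scored class, like the figures below
top = explanation.available_labels()[0]
print(class_names[top], explanation.as_list(label=top))
```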

LIME explanation for a correct prediction

In this case, it is clear that the words "invitation", "party" and "birthday" were the ones responsible for the model's decision to choose the "party supplies" category.

LIME explanation for an incorrect prediction

In this case, we can see that the word "model" was mainly responsible for the mistake; the word "puzzle" pointed toward the right label, but it wasn't enough. At least we can explain the model's mistakes in a way humans can understand, instead of just talking about numeric scores.

Summary

This article was meant to show why explaining and interpreting Machine Learning models is so important, and how some of these tools work. Learning how these machines think is a nice way to learn about the problems we want to solve: they can see things we can't and reach insights much faster than we do. However, we can still explain things better than they can.

References

Explainable AI: The data scientists’ new challenge

Local Interpretable Model-agnostic Explanations of Bayesian Predictive Models via Kullback–Leibler Projections

Machine Learning Explainability - Kaggle
