Why you need to care about Explainable Machine Learning

Flavio Ferrara
5 min read · Nov 15, 2018

In our previous blog posts, we presented the idea of Machine Learning (ML) Explainability and why it’s clear that society cares about explainable artificial intelligence.

We saw that ML explainability is potentially a great support for model building and model validation. Those who build ML systems should care about interpretability: practitioners and engineers should pursue it as a means to build better models.

The intent of this article is to show why explainability tools are important for the practice of Machine Learning and to go over a few of our favorite tools.

ML systems (e.g. for classification) are designed and optimized to identify patterns in huge volumes of data. It is unbelievably easy to build a system capable of finding very complex patterns between the input variables and the target category. Structured (tabular) data? Throw an XGBoost at it. Unstructured data? Some deep network to the rescue!

A typical ML workflow consists of exploring the data, preprocessing features, training a model, then validating the model and deciding whether it's ready to be used in production. If not, we go back, often to engineer new features for our classifier. Most of the time, model validation is based on a measure of predictive power: for instance, the area under the Receiver Operating Characteristic (ROC) curve is often quite reliable.
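As a minimal sketch of that train-and-validate loop (synthetic data and a scikit-learn gradient-boosted classifier stand in here for a real dataset and model):

```python
# Minimal sketch of the train/validate loop described above.
# Synthetic data and a gradient-boosted classifier stand in for a real problem.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Validate on held-out data with the area under the ROC curve.
test_scores = model.predict_proba(X_test)[:, 1]
print("Test AUC:", roc_auc_score(y_test, test_scores))
```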

Fig. 1 — How model interpretation fits in the common ML workflow

However, during model building, many design decisions can slightly change the model. Not only the choice of the classifier, but countless decisions in each preprocessing step. It turns out that, given a non-trivial problem, there are countless models with high predictive power, each one telling a whole different story about the data. Some models may capture a relation that seems very predictive for a given dataset, but that your knowledge of the real world would easily spot as too specific.

This has been splendidly called the Rashomon Effect. Which of those models should we deploy in production to make critical decisions? Should we always take the model with the highest absolute AUC? How should we differentiate between good and bad design decisions?

Interpretable machine learning tools help us decide and, more broadly, do better model validation. Model validation involves more than looking at the AUC. It should include answering questions such as: how does the model output relate to the value of each feature? Do these relations match human intuition and/or domain knowledge? Which features weigh the most for a specific observation?

We can roughly divide interpretability into global and local analysis.

Global analysis methods will give you a general sense of the relation between a feature and the model output. For example: how does the house size influence the chance of being sold in the next three months?

Local analysis methods will help you understand a particular decision. Suppose you have a high probability of default (not paying back) for a given loan application. Usually, you want to know which features led the model to classify the application as high risk.

Global methods

For global analysis, start by using Partial Dependence Plots and Individual Conditional Expectation (ICE).

The partial dependence plot displays the probability for a certain class given different values of the feature. It is a global method: it takes into account all instances and makes a statement about the global relationship of a feature with the predicted outcome. [Credits: Interpretable Machine Learning]

A partial dependence plot gives you an idea of how the model responds to a particular feature. It can show whether the relationship between the target and a feature is linear, monotonic or more complex. For example, the plot can show a monotonically growing influence of Square Meters on the house price (that’s good). Or you can spot a weird situation when spending more money is better for your credit scoring — trust me, it happens.

The partial dependence plot is a global method because it does not focus on specific instances but on an overall average. For binary classification, the partial dependence value at x1 = 50 is the average predicted probability of the positive class if all observations in the dataset had x1 equal to 50.
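That definition translates almost directly into code. A minimal sketch, reusing the model and X_train from the earlier snippet (the feature index and grid are arbitrary choices for illustration):

```python
# Partial dependence of one feature, computed by hand: force every observation
# to the same feature value, then average the predicted probability of the
# positive class. Reuses model and X_train from the sketch above.
import numpy as np

def partial_dependence(model, X, feature_idx, grid):
    pd_values = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # all observations get this feature value
        pd_values.append(model.predict_proba(X_mod)[:, 1].mean())
    return np.array(pd_values)

grid = np.linspace(X_train[:, 0].min(), X_train[:, 0].max(), num=20)
print(partial_dependence(model, X_train, feature_idx=0, grid=grid))
```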

The equivalent of the PDP for single observations is the Individual Conditional Expectation (ICE) plot. ICE plots draw one line per instance, representing how that instance's prediction changes when the feature changes while the other feature values are kept fixed.

Fig. 2 — Partial Dependence Plot (the bold line) with Individual Conditional Expectations. Credits: https://github.com/SauceCat/PDPbox

Being a global average, PDP can fail to capture heterogeneous relationships that come from the interaction between features. It’s often good to equip your Partial Dependence Plot with ICE lines to gain a lot more insight.
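A sketch of such a combined plot using scikit-learn's inspection module (PDPbox, credited in the figure above, produces a similar chart); it reuses model and X_train from the earlier snippets and needs matplotlib:

```python
# Partial dependence curve (the average) overlaid with ICE lines (one per
# instance). Reuses model and X_train; requires scikit-learn >= 1.0 and matplotlib.
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

PartialDependenceDisplay.from_estimator(
    model,
    X_train,
    features=[0],
    kind="both",     # draw both the PDP and the individual ICE lines
    subsample=100,   # plot only 100 ICE lines to keep the chart readable
    random_state=0,
)
plt.show()
```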

Local methods

One of the latest and most promising approaches to local analysis is SHapley Additive exPlanations (SHAP). It aims to answer the question: why did the model make that specific decision for this instance? SHAP assigns each feature an importance value for a particular prediction.

Fig. 3 — An example of SHAP values for Iris dataset. You can see how the petal length influences the classification much more than sepal length. Credits: https://github.com/slundberg/shap

Before production, you can deploy your model in a test environment and submit data from, say, a holdout test set. Computing SHAP values for the observations in that test set gives an interesting approximation of how the features will influence the model outputs in production. In this case, we strongly recommend extracting the test set “out of time”, that is, using the most recent observations as the holdout data.
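A sketch of that check with the shap package, reusing model and X_test from the earlier snippets (TreeExplainer is suited to tree ensembles such as gradient boosting):

```python
# SHAP values on the held-out test set: how much each feature pushed each
# prediction up or down. Reuses model and X_test; requires the shap package.
import shap

explainer = shap.TreeExplainer(model)         # suited to tree ensembles
shap_values = explainer.shap_values(X_test)   # one row of feature contributions per observation

# Global view: feature importance plus the direction of each feature's effect.
shap.summary_plot(shap_values, X_test)

# Local view: contributions behind a single prediction, e.g. the first test observation.
print(shap_values[0])
```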

Interpretable decisions are already an important requirement for applying ML models in the real world.

In many critical ML applications, the solution has been to consider only inherently interpretable algorithms, such as linear models. Those algorithms, incapable of capturing fine-grained patterns specific to the training dataset, capture only general trends: trends that are easily interpretable and can be matched against domain knowledge and intuition.

Interpretability tools offer us an alternative: use a powerful algorithm, let it capture any pattern, and then use your human expertise to remove the undesirable ones. Among all the many possible models, choose the one that tells the right story about the data.

When you have interpretable results from your trained model, you can exploit that interpretability. The outputs of the tools described above can form a brief report that a business person understands. After all, you need to explain to your boss why your model works so well. Interpretable models will empower you, your boss and all stakeholders to make better, well-supported business decisions.

Conclusion

Sometimes people say that only ML practitioners in highly regulated applications should bother about interpretability. We believe in the opposite: every ML practitioner should use interpretability as an additional tool to build better models.
