Interpretable Machine Learning: Local Interpretation Methods

AC
Data Folks Indonesia
Mar 5, 2023
Photo by Isaac Smith on Unsplash

In this article, we will walk through some of the most popular methods for interpreting machine learning models. These methods are model-agnostic, but most of them also rely heavily on interpretable algorithms such as regression, tree-based models, or probabilistic models.

This article is heavily influenced by Interpretable Machine Learning by Christoph Molnar.

Local interpretation methods explain individual predictions, in contrast to global interpretation methods, which explain the general behavior of the model.

Individual Conditional Expectations (ICE)

Individual Conditional Expectation (ICE) plots are similar to the Partial Dependence Plot (PDP), which visualizes the average partial relationship between a selected feature and the prediction. The problem is that an average curve such as the PDP hides the complexity of the modeled relationship when the partial relationship varies across samples.

In particular, the ICE plot shows the fitted values for every sample separately, revealing where and how much heterogeneity exists. Hence, you can observe and explain a single sample by varying a single feature and inspecting how the predicted target changes.
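As a minimal sketch, ICE curves can be drawn with scikit-learn's PartialDependenceDisplay; the dataset, model, and feature chosen here are illustrative assumptions, not part of the original article.

```python
# ICE plot sketch with scikit-learn (illustrative dataset/model/feature).
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor().fit(X, y)

# kind="both" overlays the individual ICE curves with the averaged PDP curve,
# making per-sample heterogeneity around the average visible.
PartialDependenceDisplay.from_estimator(model, X, features=["bmi"], kind="both")
plt.show()
```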

Local Interpretable Model-agnostic Explanations (LIME)

Using the LIME method, any black-box machine learning model can be approximated with a local, interpretable model to explain each individual prediction. LIME examines what happens to the predictions when perturbed versions of your data are fed into the model. LIME supports tabular, text, and image data.

LIME generates a new dataset made up of perturbed samples and the corresponding black-box model predictions. On this perturbed dataset, weighted by how close each sample is to the instance of interest, LIME fits an interpretable model (regression and its variants, tree-based models, etc.). The trained model should be a good local approximation of the machine learning model's predictions, but it is not required to be a good global approximation. LIME therefore trades off fidelity against complexity.
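A minimal sketch of this workflow with the lime package on tabular data; the classifier and dataset are illustrative assumptions.

```python
# LIME sketch for tabular data (illustrative dataset/model).
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    training_data=data.data,
    feature_names=data.feature_names,
    class_names=data.target_names,
    mode="classification",
)

# Explain one instance: LIME perturbs it, weights the samples by proximity,
# and fits a weighted linear model whose coefficients form the explanation.
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5
)
print(explanation.as_list())
```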

Anchors

Anchors is a model-agnostic method that explains the behavior of a model by finding a decision rule with high precision. It was proposed in the paper Anchors: High-Precision Model-Agnostic Explanations. Like LIME, the method creates perturbed samples to generate a local explanation for the target prediction. The output is easier to understand because it consists of IF-THEN rules, called anchors.
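One way to try this in practice is the alibi library's AnchorTabular explainer, one available implementation of the Anchors paper; the model and data below are illustrative assumptions, and the API shown is alibi's, not the original authors' code.

```python
# Anchors sketch via alibi's AnchorTabular (illustrative dataset/model).
from alibi.explainers import AnchorTabular
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = AnchorTabular(model.predict, feature_names=data.feature_names)
explainer.fit(data.data)  # discretizes numerical features for the rule search

# The result is an IF-THEN rule that "anchors" the prediction with high precision.
explanation = explainer.explain(data.data[0], threshold=0.95)
print("IF", " AND ".join(explanation.anchor), "THEN the prediction holds")
print("precision:", explanation.precision)
```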

Shapley Values

Shapley values are one of the most popular methods for explaining the output of a machine learning model. The method comes from game theory, where it is used to determine the contribution of each player in a cooperative game by assigning payouts to players based on their contribution to the total payout. The Shapley value of a feature value is its average marginal contribution to a prediction over all possible coalitions of the features.
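For reference, this is the standard Shapley value formula, written here in LaTeX with F denoting the set of features and f_S the model restricted to the feature subset S (notation chosen for this sketch, not taken from the article):

```latex
% Shapley value of feature j for one prediction: the average marginal
% contribution of x_j over all subsets S of the remaining features F \ {j}.
\phi_j(x) = \sum_{S \subseteq F \setminus \{j\}}
    \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}
    \left[ f_{S \cup \{j\}}\!\left(x_{S \cup \{j\}}\right) - f_S\!\left(x_S\right) \right]
```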

Shapley Additive Explanations (SHAP)

Like the Shapley value method above, SHAP is a local interpretation method, but it allows Shapley values to be computed with far fewer coalition samples. The original SHAP paper proposed KernelSHAP and TreeSHAP. KernelSHAP is based on a weighted linear regression whose coefficients are the Shapley values.
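A minimal sketch with the shap package, using TreeSHAP on a tree ensemble; the regressor and dataset are illustrative assumptions.

```python
# SHAP sketch using TreeSHAP on a tree ensemble (illustrative dataset/model).
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor().fit(X, y)

# TreeExplainer implements TreeSHAP, an efficient algorithm for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:5])

# Each row holds one additive contribution per feature; the contributions
# plus explainer.expected_value sum to the model's prediction for that row.
print(shap_values.shape)  # (5, n_features)
```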

Conclusion

Those are the most common local methods for interpreting machine learning models. Every method has its own pros and cons, so choose carefully and be explicit about how its explanations should be read.
