Explainable AI (XAI)
Model Interpretability techniques to explain “Black Box” Models
Series: Interpretable Machine Learning
This article is based on the implementation of different model-agnostic methods explained by Christoph Molnar in his book "Interpretable Machine Learning". If you would like to study the topic in depth, you can find the e-book at: https://christophm.github.io/interpretable-ml-book/index.html
Note: This article is Part 2 of the series: Interpretable Machine Learning
To understand why interpretability is important and why one uses a complex model, refer to
Part 1 — Model Complexity, Accuracy and Interpretability: https://medium.com/@sajee.a/model-complexity-accuracy-and-interpretability-59888e69ab3d
Contents
- Introduction
- Need for ML Interpretability
- Why Model Agnostic methods?
- Model Agnostic interpretable methods
- Conclusion
Introduction
Today, machine learning has a growing impact on our day-to-day lives, so we need to know how models work internally in order to trust their predictions. One of the biggest challenges of using machine learning is its "Black Box" nature, which stems from the models' lack of explanation for their predictions.
Need for ML Interpretability
It's amazing to see how, over the past few years, Machine Learning and AI have come to shape the decisions humans make in their lives. From medical diagnostics to legal decisions to companies making business decisions based on ML models, it is a huge risk if we have no idea "Why?" and "How?" we get a certain prediction. When we interpret a model, we can account for fairness, accountability and transparency in the model's predictions, which helps build trust in them.
Interpretability is the degree to which a human can understand the cause of a decision. The higher the interpretability, the better it is to comprehend the decisions or predictions the model has made.
Christoph Molnar, in his book "Interpretable Machine Learning", describes both interpretable models and non-interpretable models. For non-interpretable models, there are different model-agnostic methods that can be used to interpret their decisions.
To interpret models we basically need to know:
- Feature Importance
- Effect of a feature on a particular prediction
- Effect of each feature over a large number of predictions
We will be discussing each of these in detail below.
Why Model Agnostic Methods
When you think of a "Black Box", you do not know what is inside it, i.e., you cannot understand how it works internally. It is easier to work with model-agnostic explanations because the same method can be used for any type of model. The alternative to model-agnostic interpretation methods is to use only interpretable models.
Advantages of Model agnostic methods:
- Model flexibility
- Explanation flexibility
- Representation flexibility
Model Agnostic Interpretable Methods:
Feature Importance:
- Permutation Feature Importance
- Feature Interaction
Causal Interpretation:
- Partial Dependence Plots (PDP)
- Individual Conditional Expectation (ICE)
Surrogate Models:
- Global Surrogate
- Local Surrogate — LIME
Explain Predictions:
- Scoped Rules
- SHAP values
In this article, we will use these model-agnostic methods to interpret the results of the Gradient Boosting Regressor model we created in the last article.
Feature Importance:
When the business asks questions like "Why did our customers churn?" or "What leads to more customer retention?", it is important to understand the features that affect these predictions.
Permutation Feature Importance:
Permutation feature importance measures the increase in the prediction error of the model after we permuted the feature’s values, which breaks the relationship between the feature and the true outcome.
So if a feature is “important”, shuffling its values increases the model error.
Python Implementation — ELI5
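Below is a minimal sketch of how the permutation importance weights can be computed with ELI5; the names GBR_model, X_test and Y_test are assumptions carried over from the model built in the previous article.
import eli5
from eli5.sklearn import PermutationImportance
# Permute each feature on held-out data and measure the drop in model score
perm = PermutationImportance(GBR_model, random_state=42).fit(X_test, Y_test)
# Display the feature weights (higher weight = more important feature)
eli5.show_weights(perm, feature_names=X_test.columns.tolist())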
Higher weight indicates higher feature importance.
Features like hour, working day, temp and humidity are important; permuting their values drastically changes the model's predictions.
Feature Interaction:
Feature interaction measures the variance in the partial dependence function of one feature, i.e., how permuting one feature's values affects the effect of another feature. If the variance is high, the features interact with each other; if it is zero, they do not interact.
Let's look at the correlation matrix to find features that are highly correlated, and then examine their feature interaction.
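As a quick sketch, the correlation matrix can be computed directly from the feature DataFrame (X_train is an assumed name from the previous article):
import seaborn as sns
import matplotlib.pyplot as plt
# Pairwise Pearson correlations of the weather-related features
corr = X_train[['temp', 'atemp', 'humidity', 'windspeed']].corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.show()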
The pairs humidity & windspeed and temperature & humidity have a negative correlation. Let's look at their feature interaction now.
Theory: Friedman’s H-statistic:
According to Friedman's H-statistic, we deal with 2 cases:
- A two-way interaction measure that tells us whether and to what extent two features interact with each other
- A total interaction measure that tells us whether and to what extent a feature interacts with all other features in the model
Python Implementation — H-statistics
The H-statistic of the variables represented by the elements of array_or_frame and specified by indices_or_columns can be computed. The larger H is, the stronger the evidence for an interaction among the variables. H varies from 0 to 1.
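A minimal sketch using the sklearn_gbmi package, which implements Friedman's H-statistic for scikit-learn gradient boosting models; GBR_model and X_train are assumed names, and the feature subset is chosen to match the output below.
from sklearn_gbmi import h_all_pairs
# Pairwise H-statistics for the weather-related features
h_all_pairs(GBR_model, X_train[['temp', 'atemp', 'humidity', 'windspeed']])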
{('temp', 'atemp'): 0.15373552315496558,
('temp', 'humidity'): 0.09849995273815548,
('temp', 'windspeed'): 0.5574920397015759,
('atemp', 'humidity'): nan,
('atemp', 'windspeed'): nan,
('humidity', 'windspeed'): 0.4392361526105014}
The interactions between temp & windspeed and humidity & windspeed are relatively high.
Causal Interpretations:
Causal interpretation of a "Black Box" model shows, for the features that contribute to the model's predictions, how changes in their input values change the model's behavior. To learn more about research on causal interpretation of black box models, you can read more here.
How to get to causality:
- Check for features that have high impact on model
- Measure the importance based on the contribution to accuracy
- Check for causality between the features and target
Friedman’s Partial Dependence Plots (PDP):
Partial dependence plots are one level of drill-down from feature importance. While feature importance shows which variables most affect predictions, partial dependence plots show how a feature affects predictions. Once we know which features are important, we need to know how changing a feature's value affects the model's prediction, i.e., the causal relationship between the feature and the prediction.
By keeping the other features fixed, we can find the causal interpretation between a feature's input and the model's prediction.
Python Implementation — PDPbox
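A minimal sketch with PDPbox, assuming the 0.2.x API and the GBR_model / X_train names from earlier:
from pdpbox import pdp
# Partial dependence of the predicted bike count on temperature
pdp_temp = pdp.pdp_isolate(model=GBR_model, dataset=X_train,
                           model_features=X_train.columns.tolist(),
                           feature='temp')
pdp.pdp_plot(pdp_temp, 'temp')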
Most of the bike rides happened when the temperature was warm but not too hot.
Hotter temperatures mean more bike rides. As the temperature increases above 20 degrees Celsius the number of bike rides increases, and it drops off after the temperature reaches around 30 degrees Celsius.
Bike rides increase when humidity exceeds 60%.
The average behavior of PDPs can be misleading in the presence of strong interactions or for highly nonlinear response functions. This is when ICE plots help us get better insight into the relationship.
Individual Conditional Expectations(ICE):
ICE plots disaggregate the PDP function to reveal interactions and individual differences. An ICE plot visualizes the dependence of the prediction on a feature for each instance separately, resulting in one line per instance. When a feature interacts with another feature, an ICE plot can capture it better than a PDP.
Python Implementation — PyCEbox
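A minimal sketch with PyCEbox (GBR_model and X_train are assumed names; frac_to_plot keeps the plot readable by drawing only a sample of the lines):
from pycebox.ice import ice, ice_plot
# One curve per instance showing how the prediction changes with temperature
ice_df = ice(X_train, 'temp', GBR_model.predict)
ice_plot(ice_df, frac_to_plot=0.1, plot_pdp=True)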
Most bike riders prefer to ride when the temperature is above 20 degrees Celsius.
Surrogate Models:
Surrogate models are simplified models trained to approximate the "Black Box" model, under the constraint that the surrogate model should be interpretable. Surrogate models can work either at the global level (interpreting the model) or at the local level (interpreting a single prediction).
Global Surrogate:
Global surrogate model is an interpretable model that is trained to approximate the predictions of a black box model. Fitting a surrogate model requires no information about the inner workings of the black box model, only the relation of input and predicted output is used. The choice of the base black box model type and of the surrogate model type is decoupled.
How does Surrogate model work?
- Select a dataset X — the training dataset, a dataset with the same distribution, or a subset of the data
- Get the predictions of the black box model on X
- Select an interpretable model type (linear model, decision tree, …)
- Train the interpretable model on the dataset X and the black box model's predictions
Python Implementation — Tree Surrogate with Skater
Skater uses a tree surrogate to explain a model's learned decision policies. The base estimator ("Oracle") can be any form of supervised learning predictive model — in our case, the Gradient Boosting Regressor model.
# Build a tree surrogate model with Skater
from skater.core.explanations import Interpretation
from skater.model import InMemoryModel

interpreter = Interpretation(training_data=X_train, feature_names=X_train.columns)
model = InMemoryModel(GBR_model.predict, examples=X_train)

# Fit the surrogate explainer against the oracle and score it with mean absolute error
surrogate_explainer = interpreter.tree_surrogate(oracle=model, seed=5)
mae = surrogate_explainer.fit(X_train, Y_train, use_oracle=True, prune='post', scorer_type='mae')
# mae = 22.945
The output of the implementation generates a fidelity score to quantify the tree-based surrogate model's approximation to the Oracle. Given that the MAE is approximately 23, the surrogate model is a close approximation of the Gradient Boosting Regressor model.
Local Surrogate (LIME)
Local Interpretable Model-agnostic explanations:
LIME provides local model interpretability, i.e., it focuses on training local surrogate models to explain individual predictions. LIME tests what happens to the predictions when you feed variations of your data into the machine learning model.
How does LIME work?
- Select the local instance whose prediction you want explained
- Perturb the dataset to get new data points
- Weight the new samples according to their proximity to the instance
- Train a weighted, interpretable model on the dataset with the variations
- Explain the prediction by interpreting the local model
Python Library — lime
Explaining the prediction of a single instance:
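The sketch below uses lime's tabular explainer in regression mode with verbose output, which prints the local intercept, the local surrogate's prediction and the black box prediction shown underneath; the instance index and the GBR_model / X_train / X_test names are assumptions.
from lime.lime_tabular import LimeTabularExplainer
# Local surrogate: fit a weighted linear model around one test instance
explainer = LimeTabularExplainer(X_train.values,
                                 feature_names=X_train.columns.tolist(),
                                 mode='regression', verbose=True)
exp = explainer.explain_instance(X_test.values[10], GBR_model.predict, num_features=10)
exp.show_in_notebook(show_table=True)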
Intercept 263.5156364850176
Prediction_local [-4.57966725]
Right: 2.126336579439559
Explain Predictions:
Getting explanations for individual predictions helps significantly in understanding how the model works.
Scoped Rules:
Anchors was published by the same group that worked on LIME, after they found a limitation of LIME: it is not clear whether a given explanation applies in the region where an unseen instance is located.
To learn in depth about Anchor, refer here — https://homes.cs.washington.edu/~marcotcr/aaai18.pdf
Anchors provide high-precision explanations for individual predictions of any black-box classification model by finding a decision rule that "anchors" the prediction sufficiently. It uses a perturbation-based strategy to generate local explanations for predictions. So, instead of building a surrogate model, it uses easy-to-understand IF-THEN rules, called anchors.
How are LIME and Scoped Rules different?
LIME solely learns a linear decision boundary that best approximates the model given a perturbation space, while the Anchors approach constructs explanations whose coverage is adapted to the model's behavior and clearly expresses their boundaries. The Anchor method creates perturbations and evaluates them for every instance that is being explained.
Python Implementation — Anchor
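The sketch below uses the anchor-exp package. Since Anchors explains classifiers, it assumes the regression target has been discretized into classes; class_names, categorical_names and classifier are hypothetical names for that setup.
from anchor import anchor_tabular
# Explainer over the tabular training data (hypothetical discretized setup)
explainer = anchor_tabular.AnchorTabularExplainer(class_names,
                                                  X_train.columns.tolist(),
                                                  X_train.values,
                                                  categorical_names)
# Find an IF-THEN rule that "anchors" this prediction with high precision
exp = explainer.explain_instance(X_test.values[10], classifier.predict, threshold=0.95)
print('Anchor: %s' % ' AND '.join(exp.names()))
print('Precision: %.2f' % exp.precision())
print('Coverage: %.2f' % exp.coverage())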
Anchor: year <= 0.00 AND weekday_6 <= 0.00 AND weekday_0 <= 0.00 AND 0.34 < temp <= 0.50 AND 6.00 < hr <= 12.00 AND season_Fall <= 0.00 AND month <= 4.00 AND humidity > 0.78 AND weekday_3 <= 0.00 AND weather_4 <= 0.00 AND total_count > 282.00 AND weather_2 <= 0.00 AND weekday_4 <= 0.00 AND weather_3 <= 0.00 AND 0.00 < season_Summer <= 1.00 AND season_Winter <= 0.00 AND windspeed <= 0.10 AND 0.33 < atemp <= 0.48 AND weekday_5 > 0.00 AND 0.00 < workingday <= 1.00 AND holiday <= 1.00 AND weekday_2 <= 0.00 AND weekday_1 <= 0.00 AND 0.00 < weather_1 <= 1.00 AND season_Spring <= 0.00
Precision: 0.99
Coverage: 0.28
SHAP values:
SHAP stands for SHapley Additive exPlanations. It is a kernel-based estimation approach for Shapley values, inspired by local surrogate models. The SHAP explanation method computes Shapley values from coalitional game theory.
SHAP can be used for both Global and Local Interpretations.
Python Library — shap
Local Interpretations:
For local interpretability, the goal of SHAP is to explain the prediction of an instance x by computing the contribution of each feature to the prediction.
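A minimal sketch with the shap library's kernel explainer (GBR_model, X_train and X_test are assumed names; the background sample and instance index are arbitrary):
import shap
# Kernel SHAP: estimate Shapley values against a background sample
explainer = shap.KernelExplainer(GBR_model.predict, shap.sample(X_train, 100))
shap_values = explainer.shap_values(X_test.iloc[10, :])
# Force plot: how each feature pushes the prediction away from the average
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values, X_test.iloc[10, :])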
Actual Prediction — 149 | Average Prediction — 164.67
Local interpretability shows the contribution each feature makes towards the individual prediction. Year and working day have the most positive effect, while hr has a negative effect.
Global Interpretations:
To get an overview of which features are most important for a model we can plot the SHAP values of every feature for every sample.
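Reusing the explainer from the local sketch above, a summary plot over a sample of instances gives the global view (the sample size is arbitrary):
# SHAP values for many instances, then a summary plot of per-feature effects
shap_values_all = explainer.shap_values(X_test.iloc[:500, :])
shap.summary_plot(shap_values_all, X_test.iloc[:500, :])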
Hr has the highest effect on the prediction followed by working day, temp and humidity.
Conclusion:
After walking through the different model-agnostic methods to interpret our Gradient Boosting Regressor model, we get the following explanations of the model's predictions:
- Hr, temperature, humidity and working day are the most important features for our regression problem.
- The temperature, humidity and windspeed features have high interaction among them.
- Bike rides increase with increasing temperature and humidity, and decrease with windspeed.
- Interpreting individual predictions tells us which features contributed positively or negatively to each prediction.
- Building a surrogate model lets us interpret the black box model as a whole.
This is a great way to interpret any model's internal workings, which helps us build that "Trust" in the model.
Explore H2O.ai MLI Capability:
For most of the model agnostic methods discussed above, H2O Driverless AI provides robust interpretability of machine learning models to explain modeling results with their Machine Learning Interpretability (MLI) capability.
To learn more about the H2O.ai MLI capability:
https://www.h2o.ai/products-dai-mli/
About Me
Experienced Data Analyst with strong analytical skills and a deep interest in Machine Learning. You can connect with me through Medium Blogs, LinkedIn or Tableau.