Two minutes NLP — Partial Dependence and ICE Plots
Explainable AI with sklearn.inspection
Hello fellow NLP enthusiasts! After writing about ways to explain predictions with LIME and SHAP, today I delve into two more Explainable AI techniques: partial dependence plots and individual conditional expectation plots. Their names may sound convoluted, but I assure you they are easy to understand. Enjoy! 😄
Partial Dependence Plots
Partial dependence plots (PDP) and individual conditional expectation (ICE) plots can be used to visualize and analyze the interaction between the target response and a set of input features of interest.
Partial dependence plots show the dependence between the target function (i.e. our machine learning model) and a set of features of interest, marginalizing over the values of all other features (a.k.a. the complement features). They are computed applying the model to a set of data, varying the values of the features of interest while keeping fixed the values of the complement features, and analyzing the model outputs.
Individual Conditional Expectation Plots
While the PDPs are good at showing the average effect of the target features, they can obscure relationships created by interactions between features that manifest themselves only on some samples.
Similar to PDPs, an individual conditional expectation (ICE) plot shows the dependence between the target function and a feature of interest. However, unlike partial dependence plots, which show the average effect of the features of interest, ICE plots visualize the dependence of the prediction on a feature for each sample separately, with one line per sample.
Let’s see how to make these plots with Python.
Code example
First, we import the necessary libraries.
In this example we use the California housing dataset, where the goal is to predict the average house price in California block groups from features such as the median income or the average number of rooms per household. We’ll then train a RandomForestRegressor to predict house prices from these features and, finally, make partial dependence plots and individual conditional expectation plots. scikit-learn offers the PartialDependenceDisplay class in the sklearn.inspection module for this purpose; in this article we’ll use the equivalent plotting functions from the shap library.
I added an explanation of the features in the dataset in the next code snippet. In this article, we mainly deal with these features:
- AveOccup: the average number of household members in the block group.
- MedInc: the median income in the block group.
Here are some sample feature values from the dataset. The target, i.e. the average house price of each block group, is a float between 0 and 5, expressed in hundreds of thousands of dollars.
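The original snippet isn’t shown here; a sketch of loading the data and inspecting a few rows (with as_frame=True so we get pandas objects; feature and target names follow scikit-learn’s dataset documentation) might be:

```python
from sklearn.datasets import fetch_california_housing

# Load the California housing dataset as pandas objects.
data = fetch_california_housing(as_frame=True)
X = data.data    # features: MedInc, HouseAge, AveRooms, AveBedrms,
                 # Population, AveOccup, Latitude, Longitude
y = data.target  # median house value, in hundreds of thousands of dollars

print(X[["MedInc", "AveOccup"]].head())
print(y.min(), y.max())  # roughly between 0 and 5
```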
Let’s train a RandomForestRegressor that learns to predict the prices from the house features.
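A sketch of the training step; the hyperparameters (n_estimators, random_state) are illustrative, not from the original article:

```python
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target

# Fit a random forest on all eight features.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)
```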
The last step is to make the plots. We make them using the shap library, which contains several Explainable AI-related methods. We import shap and create a small subset of the training data, in this case with 100 samples.
Next, we make the partial dependence plots using the partial_dependence function of the shap.plots module, passing as arguments:
- The feature of interest (AveOccup).
- The prediction function (model.predict).
- A dataset (X100).
- Whether to make a partial dependence plot or an individual conditional expectation plot (the ice flag).
- Whether to also plot the average model prediction (model_expected_value) and the average feature value (feature_expected_value).
This function iterates over all the samples in X100 and, for each sample, calls model.predict many times with different values of the target feature while keeping the complement features (i.e. all the other features) fixed. The resulting plot shows the average output of the model for each value of the target feature, over the whole dataset.
In this plot, we see that the expected model prediction is high when AveOccup is below 2, then it quickly decreases until AveOccup is 4, and remains basically constant for higher AveOccup.
Let’s do the same for the MedInc feature.
It looks like the average predicted house price increases as the median income increases.
Let’s now try individual conditional expectation plots. We can make them using the partial_dependence function again, but this time with the ice parameter set to True.
The result still shows the average model prediction over variations of the AveOccup feature, drawn as the darker blue line. However, ICE plots also show the output variations for each individual sample, which lets us see whether some samples exhibit different feature interactions.
For example, at the top of the chart, we can see that there are block groups for which the model predicts a high price which doesn’t change much with variations of the AveOccup feature. These samples would be interesting to further investigate.
Similarly, let’s compute the ICE plot for the MedInc feature.
Again, for the majority of the samples the model follows the rule that a higher MedInc implies a higher predicted price, but there are some exceptions that would be interesting to investigate.
Let’s now put away our model and analyze the training data to find relations between AveOccup, MedInc, and the block group price. We create a scatter chart where x is AveOccup, y is MedInc, and each sample color represents the block group price.
Can you guess what the chart looks like?
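The plotting snippet isn’t reproduced here; a sketch with matplotlib, where the AveOccup < 10 clipping is my own assumption to keep the axis readable (a few block groups have extreme occupancy values):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing

data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target

# Clip extreme occupancies so the bulk of the data stays visible
# (the < 10 threshold is an arbitrary choice for readability).
mask = X["AveOccup"] < 10
sc = plt.scatter(X.loc[mask, "AveOccup"], X.loc[mask, "MedInc"],
                 c=y[mask], s=4, cmap="viridis")
plt.xlabel("AveOccup")
plt.ylabel("MedInc")
plt.colorbar(sc, label="Block group price")
```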
Indeed, samples with low AveOccup and high MedInc tend to have a higher price, which matches what the model learned, as revealed by the partial dependence and individual conditional expectation plots. It looks like the model has learned meaningful rules 🙂
Conclusions and next steps
In this article, we saw what partial dependence plots (PDP) and individual conditional expectation (ICE) plots are, and how to make them in Python with a regression example on the California housing dataset.
Possible next steps are: