Two minutes NLP — Partial Dependence and ICE Plots
Explainable AI with sklearn.inspection
Hello fellow NLP enthusiasts! After writing about ways to explain predictions with LIME and SHAP, today I delve into two more Explainable AI techniques: partial dependence plots and individual conditional expectation plots. Their names may sound convoluted, but I assure you they are easy to understand. Enjoy! 😄
Partial Dependence Plots
Partial dependence plots (PDP) and individual conditional expectation (ICE) plots can be used to visualize and analyze the interaction between the target response and a set of input features of interest.
Partial dependence plots show the dependence between the target function (i.e. our machine learning model) and a set of features of interest, marginalizing over the values of all other features (a.k.a. the complement features). They are computed applying the model to a set of data, varying the values of the features of interest while keeping fixed the values of the complement features, and analyzing the model outputs.
Individual Conditional Expectation Plots
While the PDPs are good at showing the average effect of the target features, they can obscure relationships created by interactions between features that manifest themselves only on some samples.
Similar to PDPs, an individual conditional expectation (ICE) plot shows the dependence between the target function and a feature of interest. However, unlike partial dependence plots, which show the average effect of the features of interest, ICE plots visualize the dependence of the prediction on a feature for each sample separately, with one line per sample.
Let’s see how to make these plots with Python.
Code example
First, we import the necessary libraries.
In this example we use the California housing dataset, where the goal is to predict the average house price in California block groups from features such as the median income or the average number of rooms per household. We’ll then train a RandomForestRegressor to predict house prices from these features and, finally, make partial dependence plots and individual conditional expectation plots. scikit-learn offers the PartialDependenceDisplay class in the sklearn.inspection module for this purpose; in this article we’ll use the equivalent plotting functions from the shap library.
I added an explanation of the features in the dataset in the next code snippet. In this article, we mainly deal with these features:
- AveOccup: the average number of household members in the block group.
- MedInc: the median income in the block group.
Here are some sample feature values from the dataset. The target, i.e. the average house price of each block group, is a float between 0 and 5, expressed in hundreds of thousands of dollars.
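The original snippet isn’t shown here; a sketch of loading the data and inspecting a few rows (with as_frame=True so we get pandas objects; feature and target names follow scikit-learn’s dataset documentation) might be:

```python
from sklearn.datasets import fetch_california_housing

# Load the California housing dataset as pandas objects.
data = fetch_california_housing(as_frame=True)
X = data.data    # features: MedInc, HouseAge, AveRooms, AveBedrms,
                 # Population, AveOccup, Latitude, Longitude
y = data.target  # median house value, in hundreds of thousands of dollars

print(X[["MedInc", "AveOccup"]].head())
print(y.min(), y.max())  # roughly between 0 and 5
```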
Let’s train a RandomForestRegressor that learns to predict the prices from the house features.
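A sketch of the training step; the hyperparameters (n_estimators, random_state) are illustrative, not from the original article:

```python
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target

# Fit a random forest on all eight features.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)
```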
The last step is to make the plots. We make them using the shap library, which contains several Explainable AI-related methods. We import shap and create a small subset of the training data, in this case with 100 samples.
Next, we make the partial dependence plots using the partial_dependence function of the shap.plots module, passing as arguments:
- The feature of interest (AveOccup).
- The prediction function (model.predict).
- A dataset (X100).
- Whether to make a partial dependence plot or an individual conditional expectation plot (the ice flag).
- Whether to also plot the average model prediction (model_expected_value) and the average feature value (feature_expected_value).
This function iterates over all the samples in X100 and, for each sample, calls model.predict many times with different values of the target feature while keeping the complement features (i.e. all the other features) fixed. The resulting plot shows the average output of the model for each value of the target feature, over the whole dataset.
In this plot, we see that the expected model prediction is high when AveOccup is below 2, then it quickly decreases until AveOccup is 4, and remains basically constant for higher AveOccup.
Let’s do the same for the MedInc feature.
It looks like the average predicted house price increases as the median income increases.
Let’s now try individual conditional expectation plots. We can make them using the partial_dependence function again, but this time with the ice parameter set to True.
The result still shows the average model prediction over variations of the AveOccup feature, drawn as the darker blue line. However, ICE plots also show the output variations for each individual sample, which lets us see whether some samples exhibit different feature interactions.
For example, at the top of the chart, we can see that there are block groups for which the model predicts a high price which doesn’t change much with variations of the AveOccup feature. These samples would be interesting to further investigate.
Similarly, let’s compute the ICE plot for the MedInc feature.
Again, for the majority of the samples the model follows the rule that a higher MedInc implies a higher predicted price, but there are some exceptions that would be interesting to investigate.
Let’s now put away our model and analyze the training data to find relations between AveOccup, MedInc, and the block group price. We create a scatter chart where x is AveOccup, y is MedInc, and each sample color represents the block group price.
Can you guess what the chart looks like?
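The plotting snippet isn’t reproduced here; a sketch with matplotlib, where the AveOccup < 10 clipping is my own assumption to keep the axis readable (a few block groups have extreme occupancy values):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing

data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target

# Clip extreme occupancies so the bulk of the data stays visible
# (the < 10 threshold is an arbitrary choice for readability).
mask = X["AveOccup"] < 10
sc = plt.scatter(X.loc[mask, "AveOccup"], X.loc[mask, "MedInc"],
                 c=y[mask], s=4, cmap="viridis")
plt.xlabel("AveOccup")
plt.ylabel("MedInc")
plt.colorbar(sc, label="Block group price")
```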
Indeed, samples with low AveOccup and high MedInc tend to have a higher price, which matches what the model learned, as revealed by the partial dependence and individual conditional expectation plots. It looks like the model has learned meaningful rules 🙂
Conclusions and next steps
In this article, we saw what partial dependence plots (PDP) and individual conditional expectation (ICE) plots are, and how to make them in Python with a regression example on the California housing dataset.
Possible next steps are: