Attribution Heatmaps using Streamlit and XaiPient Explanations API

Youchun Zhang
Published in xaipient · 4 min read · Sep 10, 2020

At XaiPient we are constantly thinking about novel, informative ways to shed light on ML model behavior. A core principle at XaiPient is that algorithm development and human-centric design play complementary roles in developing human-friendly explanations, and this has led to fresh ways of elucidating ML model behavior using a combination of narrative text, visuals and tables.

XaiPient Customer Reporting Product Concept

One tool we have found invaluable in developing our demos, MVPs and products is Streamlit, which makes it extremely easy to quickly develop professional-looking apps driven by data and ML models. Without Streamlit, developing a good-quality app demo would have taken several weeks of design and front-end work (using frameworks like React, for example), and constant back-and-forth between ML engineers, designers and front-end developers. Fortunately, using Streamlit, our ML engineers can easily and quickly put together a working web app showcasing our core explanation API in very few lines of code. Moreover, Streamlit also makes it easy for our designers to customize the styling and colors, and a recent update of Streamlit even allows our code-savvy designers and ML engineers to develop custom components.

One of the simplest types of insight into the behavior of an ML model comes from feature attributions: if we think of a model as a function F that, given an input feature vector x, produces a score s = F(x), a feature attribution maps the input x to a vector A(x) of the same dimension as x, where each component of A(x) represents the “importance”, or “contribution”, of the corresponding feature of x to the score F(x). Thus A(x) gives an understanding of which features in x are the main drivers of the score F(x). There are a variety of notions of “importance”, leading to different ways of computing attributions, such as LIME, SHAP, Integrated Gradients, etc. (See our companion blog post series for an introduction to feature attributions, especially Integrated Gradients.)
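To make the idea concrete, here is a toy illustration (not any of the methods named above): for a linear model F(x) = w · x, a natural attribution is A(x)_i = w_i · x_i, so the attributions sum to the score. The weights and input below are made up for illustration.

```python
import numpy as np

# Toy linear model F(x) = w . x; weights and input are illustrative
w = np.array([0.5, -1.0, 2.0])   # model weights
x = np.array([2.0, 1.0, 0.5])    # input feature vector

score = float(w @ x)             # F(x) = 1.0 - 1.0 + 1.0 = 1.0
attributions = w * x             # A(x) = [1.0, -1.0, 1.0]

# For this simple model, the attributions sum exactly to the score
assert np.isclose(attributions.sum(), score)
```

The second feature drags the score down while the first and third push it up, which is exactly the kind of per-feature story an attribution heatmap visualizes.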

There are many ways to visualize feature attributions for a specific input x graphically, but we wondered, is there a way to simultaneously see the attributions A(x) for multiple inputs x? To be more precise, suppose we have a model F trained on a tabular dataset, and we have some test dataset D. Can we show a tabular view of D, where in each row (representing some input x), the features are shown with “heatmap” colors representing the contribution of each feature according to A(x)? We call this an attribution heatmap, and it turns out it is easy to design this type of display in Streamlit.

Let us see how this works, using a model trained on the German Credit Risk dataset. Each row in this dataset represents attributes of a customer applying for a bank loan, such as their age, gender, job type, savings account level, housing type, loan amount, etc. For a customer x, the model's output F(x) is the predicted probability that the customer is a good credit risk.

We will be using the following packages:
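The original package list did not survive extraction; the rest of the walkthrough assumes the following imports.

```python
# Packages assumed by the rest of this walkthrough
import numpy as np
import pandas as pd

try:  # streamlit is only needed when actually running the app
    import streamlit as st
except ImportError:
    st = None
```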

To start with, we will create two tables. One is the data table D. The second is the feature attribution table A, of the same dimensions as D, where for each row x in D there is a row in A representing the attributions A(x). Here we will use random attribution values between -1 and 1, whereas for a real-world use case this table can be generated using the XaiPient local attributions API.

Both tables will be created as pandas DataFrames. The “Score” column is the model’s prediction; we will also use the value of “Score” as its own attribution value.
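The original code gist is not shown in this extracted copy; the sketch below builds the two tables under the stated assumptions (random placeholder data stands in for the real dataset, and the feature names are illustrative, not the dataset's exact columns).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Illustrative subset of German Credit Risk features (names are assumptions)
features = ["Age", "Credit amount", "Duration", "Savings level"]
n_rows = 10

# Data table D: random placeholders standing in for the real customer data
D = pd.DataFrame(rng.integers(0, 100, size=(n_rows, len(features))),
                 columns=features)

# Attribution table A: random values in [-1, 1], same shape as D.
# In a real-world use case, this is where the XaiPient local
# attributions API output would go.
A = pd.DataFrame(rng.uniform(-1, 1, size=(n_rows, len(features))),
                 columns=features)

# "Score" column: the model's prediction, reused as its own attribution
D["Score"] = rng.uniform(0, 1, size=n_rows).round(2)
A["Score"] = D["Score"]
```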

For the heatmap styles, we will create five color ranges. Since our feature attribution values always fall between -1 and 1, we will define any value between -1 and -0.2 to be red, indicating negative attributions, and any value between 0.2 and 1 to be blue, indicating positive attributions.

Color Range from Negative Attribution to Positive Attribution
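One way to encode these five ranges is as a list of (low, high, color) bins with a small lookup function. The exact interior boundaries and hex colors below are assumptions for illustration, not XaiPient's actual palette; only the -0.2/0.2 endpoints come from the text above.

```python
# Five illustrative bins from strong negative (red) to strong positive (blue)
COLOR_RANGES = [
    (-1.00, -0.20, "#d73027"),  # strongly negative attribution: red
    (-0.20, -0.05, "#fdae61"),  # weakly negative: orange
    (-0.05,  0.05, "#ffffff"),  # near zero: neutral
    ( 0.05,  0.20, "#abd9e9"),  # weakly positive: light blue
    ( 0.20,  1.00, "#4575b4"),  # strongly positive: blue
]

def attribution_color(value: float) -> str:
    """Map an attribution value in [-1, 1] to a background color."""
    for lo, hi, color in COLOR_RANGES:
        if lo <= value <= hi:
            return color
    return "#ffffff"  # fallback for out-of-range values
```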

Pandas DataFrames support conditional formatting via the df.style.apply method, which takes a function returning another DataFrame of the same shape (usually derived from the data itself), in which each cell contains the CSS styles for the corresponding data cell. In our case, we would like to populate this styling DataFrame from a second dataset: the attributions table. The function my_styler below does exactly that, generating the background CSS styles according to the color ranges we defined while iterating over all the cells of the original DataFrame.
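A self-contained sketch of my_styler (the column names, sample data, and the simplified two-color mapping here are assumptions standing in for the gist that did not survive extraction):

```python
import numpy as np
import pandas as pd

def attribution_color(value):
    """Simplified stand-in for the five-bin color ranges described above."""
    if value <= -0.2:
        return "#d73027"   # negative attribution: red
    if value >= 0.2:
        return "#4575b4"   # positive attribution: blue
    return "#ffffff"       # near zero: neutral

rng = np.random.default_rng(0)
cols = ["Age", "Credit amount", "Score"]                          # illustrative
A = pd.DataFrame(rng.uniform(-1, 1, size=(5, 3)), columns=cols)   # attributions
D = pd.DataFrame(rng.integers(18, 70, size=(5, 3)), columns=cols) # data table

def my_styler(data: pd.DataFrame) -> pd.DataFrame:
    """Return a DataFrame of CSS strings with the same shape as `data`,
    with each cell's background color looked up in the attributions table A."""
    styles = pd.DataFrame("", index=data.index, columns=data.columns)
    for row in data.index:
        for col in data.columns:
            styles.loc[row, col] = (
                f"background-color: {attribution_color(A.loc[row, col])}"
            )
    return styles
```

Note that my_styler ignores the values of the table it styles and colors each cell purely from the corresponding cell of A, which is what lets the heatmap show the data while encoding the attributions.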

Now we can apply this styler to the attributions table. Use axis=None for the styling to be applied to the entire table.

Assuming all the above code is in a file app.py, you can then run the demo with

streamlit run app.py

Here is what the table looks like:

Attribution Heatmap

In a future post we will look at more custom components we’ve developed using our core explanation API and Streamlit.

XaiPient is fundamentally re-imagining AI explainability with the human end-user in mind. Partner with us: xaipient.com. Follow us on Twitter: @XaiPient


Youchun Zhang is a UI/UX Designer at XaiPient, with an ME from Cornell and an MFA from Parsons.