Explaining Trees with FastTreeSHAP and the What-If Tool (Part 1)

Zabir Al Nazi Nabil
4 min readMar 28, 2023
Machine Learning Workflow (adjusted)

For many machine learning practitioners, the typical workflow looks like the following diagram.

Machine Learning Workflow (Image Credit: AWS)

In reality, this is just a small part of developing a robust machine-learning solution. Getting a good evaluation score is never enough! Especially in fields like finance or healthcare, it’s important to be able to explain the predictions of machine learning models.

Let’s say you are developing a deep neural network to detect melanoma, which is a malignant type of skin cancer. Your team collects image data from volunteers. You train a state-of-the-art model on the training data, and when you evaluate the performance of the model, you see an impressive 99.98% accuracy. You feel ecstatic. Now, would you go ahead and implement your model in hospital studies to screen for melanoma? Maybe you shouldn’t, at least not yet.

In many scenarios, just evaluating the model’s performance is not a rigorous way to decide if the model is really able to solve a task. A lot of things can go wrong. For instance, in the previous example, what if:

1. You did not use the proper performance metrics; this is a common problem for beginners. You should know your metrics by heart! It doesn’t make sense to calculate the mean absolute error for a classification task, and it doesn’t help to report only accuracy for a cancer classification model. It’s not as disastrous if your model predicts some benign cases as malignant; however, if your model classifies many malignant samples as benign, the result can be catastrophic (see the metrics sketch after this list).

2. The model learned from noise in the dataset: it turns out that the team assigned to collect the images used two different cameras in the process. They used camera 1 for all melanoma instances and the other device for benign samples. Your model may have latched onto the inherent noise of each device because it’s easier to model.
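
To make the first point concrete, here is a minimal, illustrative sketch (not from this article’s repository) of how accuracy can look reassuring on an imbalanced cancer dataset while recall exposes the problem; the class counts and predictions below are made up purely for illustration:

```python
# Minimal sketch: why accuracy alone is misleading on imbalanced cancer data.
# The labels and predictions are fabricated for illustration only.
from sklearn.metrics import accuracy_score, recall_score, confusion_matrix

# 1 = malignant, 0 = benign; 90 benign samples and 10 malignant samples
y_true = [0] * 90 + [1] * 10
# A lazy model that predicts "benign" for every sample
y_pred = [0] * 100

print("Accuracy:", accuracy_score(y_true, y_pred))            # 0.90 -- looks impressive
print("Recall (sensitivity):", recall_score(y_true, y_pred))  # 0.00 -- every malignant case is missed
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
```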

You can read the famous piece of AI folklore about a neural network that was trained to detect tanks but instead learned to detect the time of day here!

In this tutorial, I will show you a tool that I have used heavily with many of my machine-learning models. The What-If Tool (WIT) is an interactive explainability plugin from the People + AI Research (PAIR) initiative at Google. It offers a simple user interface for better understanding black-box classification and regression ML models. The plugin allows you to run inference on a large collection of samples and immediately view the results in a number of different ways. Examples can also be edited manually or programmatically and re-run through the model to see the effects of the changes. It also includes tooling for analyzing model fairness and performance across subsets of a dataset.

The tool’s goal is to give users a quick, easy, and effective way to examine and investigate trained machine learning models through a visual interface with minimal coding. The tool can be accessed directly in a Jupyter or Colab notebook or through TensorBoard. WIT is also readily available in user-managed notebooks on Vertex AI (Google Cloud).
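
As a rough sketch of what the notebook wiring can look like: you wrap a list of examples in a WitConfigBuilder, point it at a prediction function, and pass it to WitWidget. This assumes the witwidget package is installed; the feature names, example rows, and the random stand-in prediction function below are purely illustrative, and the exact constructor arguments may differ between versions.

```python
# Minimal sketch: embedding the What-If Tool in a Jupyter/Colab notebook.
# Assumes `pip install witwidget`; all data below is illustrative only.
import numpy as np
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

# Illustrative feature names and example rows (replace with your own dataset)
feature_names = ["mean radius", "mean texture"]
examples = [[14.2, 19.1], [20.6, 25.3], [11.8, 15.0]]

def custom_predict(examples_to_infer):
    # Stand-in for a real model: return [P(benign), P(malignant)] per example.
    # In practice this would call something like model.predict_proba(np.array(examples_to_infer)).
    rng = np.random.default_rng(0)
    p = rng.uniform(size=len(examples_to_infer))
    return np.column_stack([1 - p, p])

config_builder = WitConfigBuilder(examples, feature_names).set_custom_predict_fn(custom_predict)
WitWidget(config_builder, height=800)
```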

Understanding the performance of machine learning (ML) systems across a variety of inputs is a major challenge in creating and deploying competent ML systems.

Using the What-If Tool, you can test performance in hypothetical scenarios, evaluate the importance of different data features, and visualize model behavior across multiple models and subsets of input data, as well as across different ML fairness metrics.

Here, I will demonstrate the usage of WIT for the LightGBM model trained on a breast cancer classification dataset.

LightGBM is a gradient-boosting framework that uses tree-based learning algorithms. It is distributed and efficient by design and offers the following benefits:

  1. Faster training speed and higher efficiency. LightGBM uses a histogram-based algorithm, i.e., it buckets continuous feature values into discrete bins, which speeds up the training procedure.
  2. Higher accuracy. LightGBM grows trees leaf-wise rather than level-wise, which is the primary factor behind its higher accuracy but produces far more complex trees; setting the max_depth parameter prevents the overfitting this can occasionally cause.
  3. Support for distributed, parallel, and GPU learning.
  4. Ability to manage large amounts of data. It replaces continuous values with discrete bins, which results in lower memory usage.

Because of these benefits, LightGBM is frequently used in winning solutions of machine learning competitions.
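
Before moving on, here is a minimal sketch of training such a LightGBM model on the scikit-learn breast cancer dataset; the hyperparameter values are illustrative and are not necessarily the settings used in this article’s repository.

```python
# Minimal sketch: LightGBM on the scikit-learn breast cancer dataset.
# Assumes `pip install lightgbm scikit-learn`; hyperparameters are illustrative.
from lightgbm import LGBMClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, stratify=data.target, random_state=42
)

model = LGBMClassifier(
    n_estimators=200,
    num_leaves=31,    # leaf-wise growth: the number of leaves controls tree complexity
    max_depth=6,      # cap the depth to limit the overfitting leaf-wise growth can cause
    learning_rate=0.05,
)
model.fit(X_train, y_train)

# Report precision and recall per class, not just accuracy
print(classification_report(y_test, model.predict(X_test), target_names=data.target_names))
```

A model trained this way exposes predict_proba, which is exactly what the custom prediction function in the What-If Tool sketch above would call.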

Read Part 2 here!

The code used in this article can be found here: https://github.com/zabir-nabil/What-If-Explainability
