Visualizing Activation Heatmaps using TensorFlow

Areeb Gani · Published in Analytics Vidhya · Jan 25, 2020

It can be beneficial to visualize what a Convolutional Neural Network values when it makes a prediction, as it allows us to see whether our model is on track, as well as which features it finds important. For example, in determining whether an image contains a human, our model may find that facial features are the determining factors.

To visualize the heatmap, we will use a technique called Grad-CAM (Gradient-weighted Class Activation Mapping). The idea behind it is quite simple: to find the importance of a certain class in our model, we take the gradient of that class's score with respect to the final convolutional layer and then use it to weight that layer's output.

Francois Chollet, the author of Deep Learning with Python and the creator of Keras, says, “one way to understand this trick is that we are weighting a spatial map of how intensely the input image activates different channels by how important each channel is with regard to the class, resulting in a spatial map of how intensely the input image activates the class.”

This is the layout of using Grad-CAM:

1) Compute the model output and the last convolutional layer output for the image.
2) Find the index of the winning class in the model output.
3) Compute the gradient of the winning class with respect to the last convolutional layer.
4) Average this gradient, then weight the last convolutional layer output with it (multiply them).
5) Normalize the result between 0 and 1 for visualization.
6) Convert to RGB and layer it over the original image.

Let’s start by importing what we need.
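A minimal set that covers everything used below (assuming TensorFlow 2.x, with OpenCV and Matplotlib for the overlay and display):

```python
import numpy as np
import tensorflow as tf
import cv2                        # OpenCV, for resizing/colorizing the heatmap
import matplotlib.pyplot as plt
from tensorflow.keras.applications.inception_v3 import (
    InceptionV3, preprocess_input, decode_predictions)
```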

Now let’s load the model. Since the goal of this tutorial is to show how to generate an activation heatmap, we will simply use the Inception V3 model, which comes pretrained on ImageNet to classify images into 1,000 classes.
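Loading it with its ImageNet weights is a one-liner (the weights are downloaded automatically on first use):

```python
model = InceptionV3(weights="imagenet")
```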

This model takes in a 299x299 image. According to Sik-Ho Tsang, at “42 layers deep, the computation cost is only about 2.5 [times] higher than that of GoogLeNet and much more efficient than that of VGGNet.” It is a very deep network, which is why it is provided as a pretrained model in the Keras library. The following will print out the architecture of the model; although there are many layers listed, we are only looking for the final convolutional layer, which lies near the end of the list.
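A quick way to inspect it:

```python
model.summary()   # the final conv layer appears near the bottom of this listing
```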

As we can see, the final convolutional layer is conv2d_93 for this model. Now let’s load some images to test and see what it looks like.

The following code downloads several images that we will use to demonstrate the Grad-CAM process.
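A sketch of that step; the URLs here are placeholders rather than the originals, so substitute any elephant, dog, and cat photos you like:

```python
# Placeholder URLs; tf.keras.utils.get_file caches each download locally
# and returns the path to the cached file.
urls = {
    "elephant.jpg": "https://example.com/elephant.jpg",
    "labrador.jpg": "https://example.com/labrador.jpg",
    "cat.jpg": "https://example.com/cat.jpg",
}
paths = {name: tf.keras.utils.get_file(name, origin=url)
         for name, url in urls.items()}
```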

Let’s visualize the image.
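OpenCV reads images in BGR order, so we convert to RGB before plotting:

```python
img = cv2.imread(paths["elephant.jpg"])
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.axis("off")
plt.show()
```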

Indian elephant.

Now let’s preprocess the input to feed into our model. We will need to add a batch dimension to our image and preprocess it using the preprocess_input function provided by tf.keras.
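For example (Inception V3 expects pixel values scaled to [-1, 1], which preprocess_input handles):

```python
img = tf.keras.preprocessing.image.load_img(paths["elephant.jpg"],
                                            target_size=(299, 299))
x = tf.keras.preprocessing.image.img_to_array(img)
x = np.expand_dims(x, axis=0)   # add a batch dimension: (1, 299, 299, 3)
x = preprocess_input(x)         # scale pixel values to [-1, 1]

preds = model.predict(x)
print(decode_predictions(preds, top=1))   # top-1 class and its probability
```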

As we can see, the above picture was predicted as Indian_elephant with a probability of 0.962.

Grad-CAM

Now we can start the Grad-CAM process. To start, we will need to define a tf.GradientTape, so TensorFlow can calculate the gradients (this is a new feature in TF 2). Next, we will get the final convolutional layer, which is the aforementioned conv2d_93. Then we will create a model (which behaves as a function) that takes as input an image (model.inputs) and outputs a list of the output of the model and the output of the final convolutional layer ([model.output, last_conv_layer.output]) for later use.
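That setup might look like this (note that the layer name conv2d_93 can vary between sessions and Keras versions, so check model.summary() first):

```python
last_conv_layer = model.get_layer("conv2d_93")

# A model that returns both the predictions and the final conv activations
# in a single forward pass.
grad_model = tf.keras.models.Model(
    inputs=model.inputs,
    outputs=[model.output, last_conv_layer.output])
```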

We will calculate the class output by indexing the model output with the winning class (np.argmax finds the index of the greatest value in the input). With this info, we can calculate the gradient of the class output with respect to the convolutional layer output, which we will then average over the batch and spatial axes. Lastly, we will weight the convolutional layer output by these pooled gradients and average over the channels to get our final heatmap.
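Putting that together under the tape:

```python
with tf.GradientTape() as tape:
    model_out, conv_out = grad_model(x)     # forward pass recorded by the tape
    class_idx = np.argmax(model_out[0])     # index of the winning class
    class_out = model_out[:, class_idx]     # score of the winning class

grads = tape.gradient(class_out, conv_out)             # d(score)/d(conv output)
pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))   # one weight per channel
heatmap = tf.reduce_mean(conv_out * pooled_grads, axis=-1)  # shape (1, 8, 8)
```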

Now let’s visualize our heatmap. To do this, we will scale all the values to lie between 0 and 1 and reshape the result to an 8x8 array.
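For example:

```python
heatmap = np.maximum(heatmap, 0)    # ReLU: keep only positive influence
heatmap /= np.max(heatmap)          # scale to [0, 1]
heatmap = heatmap.reshape((8, 8))   # drop the batch dimension
plt.matshow(heatmap)
plt.show()
```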

Basic heatmap.

Now let’s cover the image with the heatmap. First, we load the image.
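At its original resolution this time:

```python
img = cv2.imread(paths["elephant.jpg"])   # full-resolution image, BGR order
```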

Next, we resize the heatmap to match the shape of the image so that we can properly superimpose it. The cv2.applyColorMap function lets us colorize the heatmap (we first multiply by 255 to convert it to 8-bit pixel values). We also multiply the heatmap by an intensity of our choosing, depending on how much we want it to cover the image.
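Concretely:

```python
heatmap_resized = cv2.resize(heatmap.astype(np.float32),
                             (img.shape[1], img.shape[0]))
heatmap_uint8 = np.uint8(255 * heatmap_resized)          # [0, 1] -> [0, 255]
heatmap_color = cv2.applyColorMap(heatmap_uint8, cv2.COLORMAP_JET)

intensity = 0.4   # how strongly the heatmap covers the image; tune to taste
superimposed = np.uint8(np.clip(heatmap_color * intensity + img, 0, 255))
```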

Now let’s view our original image and our new image with the activation heatmaps.
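Side by side with Matplotlib:

```python
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
axes[0].set_title("Original")
axes[1].imshow(cv2.cvtColor(superimposed, cv2.COLOR_BGR2RGB))
axes[1].set_title("Grad-CAM")
for ax in axes:
    ax.axis("off")
plt.show()
```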

The original picture (left) vs. the heatmap picture (right)

As we can see, the elephant’s head activated our model more than the rest of the image.

Let’s try it out on different images to see if it works. First, let’s compile all of our code into a function so it’s easy to use.
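One way to wrap all of the steps above into a single function (the name and signature here are illustrative; adapt them to your own project):

```python
def grad_cam(model, img_path, layer_name="conv2d_93", intensity=0.4):
    # Preprocess the image for Inception V3.
    img = tf.keras.preprocessing.image.load_img(img_path, target_size=(299, 299))
    x = preprocess_input(
        np.expand_dims(tf.keras.preprocessing.image.img_to_array(img), axis=0))

    # Model mapping the input to (predictions, final conv activations).
    last_conv_layer = model.get_layer(layer_name)
    grad_model = tf.keras.models.Model(
        model.inputs, [model.output, last_conv_layer.output])

    # Gradient of the winning class score w.r.t. the conv layer output.
    with tf.GradientTape() as tape:
        model_out, conv_out = grad_model(x)
        class_idx = np.argmax(model_out[0])
        class_out = model_out[:, class_idx]
    print("Class:", decode_predictions(model_out.numpy(), top=1)[0][0][1])

    grads = tape.gradient(class_out, conv_out)
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
    heatmap = tf.reduce_mean(conv_out * pooled_grads, axis=-1).numpy()

    # Normalize to [0, 1] and drop the batch dimension.
    heatmap = np.maximum(heatmap, 0)
    heatmap /= np.max(heatmap)
    heatmap = heatmap[0]

    # Resize, colorize, and overlay on the full-resolution image.
    orig = cv2.imread(img_path)
    hm = cv2.resize(heatmap.astype(np.float32), (orig.shape[1], orig.shape[0]))
    hm = cv2.applyColorMap(np.uint8(255 * hm), cv2.COLORMAP_JET)
    out = np.uint8(np.clip(hm * intensity + orig, 0, 255))
    plt.imshow(cv2.cvtColor(out, cv2.COLOR_BGR2RGB))
    plt.axis("off")
    plt.show()
```

Calling grad_cam(model, paths["labrador.jpg"]) and grad_cam(model, paths["cat.jpg"]) produces the results below.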

Class: Labrador_retriever
Class: tabby

As is evident, our Grad-CAM function can accurately show us the activation heatmap of the model, telling us what the neural network “sees” and what it values when making its prediction. This not only improves model explainability but can also help us diagnose problems and improve accuracy.
