Unveiling the Decision-Making Process: Class Activation Maps.

Aria Bishma
4 min read · Aug 4, 2024


Deep learning has emerged as a game-changer in image classification. Unlike traditional machine learning models that rely heavily on handcrafted features, deep learning models, particularly convolutional neural networks (CNNs), can automatically learn intricate features from vast amounts of data. This has resulted in significant improvements in classification accuracy.

While CNNs have achieved remarkable accuracy, their complex nature makes it difficult to understand how they arrive at their predictions. Class Activation Maps (CAMs) provide an intuitive way to visualize and interpret the decisions made by these models.

Class Activation Maps Explained

Imagine looking at a picture of a dog. Your brain instinctively focuses on key features like the eyes, ears, and whiskers to confirm that it is a dog. Class Activation Maps (CAMs) do something similar for a model: they highlight the image regions that contribute most to a Convolutional Neural Network's (CNN's) prediction.

CAMs work by applying the output-layer weights of a specific class to the feature maps produced by the final convolutional layer.
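In symbols, this weighted sum can be written as follows. The notation is borrowed from the original CAM paper listed in the references and is an assumption of this sketch, since the post itself does not define symbols: $f_k(x, y)$ is the activation of feature map $k$ at spatial location $(x, y)$, $w_k^{c}$ is the output-layer weight connecting feature map $k$ to class $c$, $b_c$ is that class's bias, and $H \times W$ is the spatial size of the feature maps.

```latex
M_c(x, y) = \sum_{k} w_k^{c} \, f_k(x, y)

S_c = \sum_{k} w_k^{c} \left( \frac{1}{HW} \sum_{x, y} f_k(x, y) \right) + b_c
    = \frac{1}{HW} \sum_{x, y} M_c(x, y) + b_c
```

The second line shows why the map is meaningful: apart from the bias, the class score $S_c$ (before softmax) is the spatial average of the CAM, so regions with large $M_c(x, y)$ are the ones pushing the score up.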

How to Generate CAMs

Class Activation Map

To generate a Class Activation Map (CAM), we first extract the output-layer weights for a specific class. Because the output layer sits directly after the Global Average Pooling (GAP) layer, there is one weight per feature map, and each weight indicates how important that feature map is to the model's score for that class.

Each weight is then multiplied by its corresponding feature map from the final convolutional layer, and the results are summed to produce a heatmap: the CAM.
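The two steps above can be sketched on toy data with NumPy alone. The array sizes (a 7x7 grid, 4 feature maps, 3 classes) and the names `feature_maps`, `dense_weights`, and `class_idx` are made up for illustration; a real model would supply these arrays.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (hypothetical sizes): 7x7 spatial grid, 4 feature maps, 3 classes
feature_maps = rng.normal(size=(7, 7, 4))   # output of the last conv layer
dense_weights = rng.normal(size=(4, 3))     # output-layer weights, one column per class

class_idx = 1
class_weights = dense_weights[:, class_idx]  # one weight per feature map

# CAM: weight each feature map by its class weight and sum over the channel axis
cam = feature_maps @ class_weights           # shape (7, 7)

# Sanity check: GAP followed by the dense layer (bias omitted) gives the
# spatial mean of the CAM
gap = feature_maps.mean(axis=(0, 1))         # shape (4,)
class_score = gap @ class_weights
print(cam.shape, np.isclose(class_score, cam.mean()))
```

The final check illustrates why GAP matters here: with GAP directly before the output layer, the class score (before bias and softmax) is the average of the CAM over all spatial locations.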

Example

As an example, we will generate a Class Activation Map (CAM) for the ResNet50 model.

First, import the necessary libraries: TensorFlow, NumPy, SciPy, Matplotlib, and OpenCV.

import tensorflow as tf
import numpy as np
import scipy.ndimage  # used later for scipy.ndimage.zoom
import matplotlib.pyplot as plt
import cv2

Then load the ResNet50 model.

model = tf.keras.applications.ResNet50(weights='imagenet', include_top=True)

If we look at model.summary(), we see that the model has a single output layer and a Global Average Pooling (GAP) layer right after the last convolutional block, so it is possible to generate a Class Activation Map (CAM) from this model.

ResNet50 Model Summary

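Then we create a new model based on ResNet50 that returns both the final prediction and the output of the last convolutional block, so we can get both in one forward pass.

# layers[-1] is the prediction layer and layers[-3] is the last convolutional block
# (layers[-2] is the GAP layer in between)
resnet_model = tf.keras.models.Model(inputs=model.input, outputs=(model.layers[-1].output, model.layers[-3].output))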

Next, we load an example image, “dog.jpg”, and preprocess it so the model can run a prediction on it.

img_path = 'dog.jpg'
# Load the image at the 224x224 input size that ResNet50 expects
img = tf.keras.preprocessing.image.load_img(img_path, target_size=(224, 224))
img = np.array(img)  # keep an RGB uint8 copy for the overlay later
x = tf.keras.preprocessing.image.img_to_array(img)
x = np.expand_dims(x, axis=0)  # add the batch dimension: (1, 224, 224, 3)
x = tf.keras.applications.resnet50.preprocess_input(x)

Dog.jpg (source: google.com)
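To make the preprocessing step less of a black box: for ResNet50, `preprocess_input` is documented as converting the image from RGB to BGR and zero-centering each channel with the ImageNet means, without scaling. Below is a rough NumPy sketch of that behavior. The function name `caffe_style_preprocess` is hypothetical, and the mean values are an assumption about what Keras uses internally, so treat this as an illustration and not a drop-in replacement.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed to match Keras' "caffe" mode)
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(batch_rgb):
    """Hypothetical re-implementation: RGB -> BGR, then subtract channel means."""
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]            # reverse the channel axis
    return batch_bgr - IMAGENET_BGR_MEANS   # zero-center, no scaling

# A fake 1x2x2 RGB "image" where every pixel is pure red (255, 0, 0)
demo = np.zeros((1, 2, 2, 3), dtype=np.uint8)
demo[..., 0] = 255
out = caffe_style_preprocess(demo)
print(out[0, 0, 0])  # blue, green, red values after centering
```

This is also why the post keeps the untouched `img` array around: the preprocessed tensor `x` is no longer a displayable RGB image.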

Then we will predict the category of that image.

preds, features = resnet_model.predict(x)
decoded_predictions = tf.keras.applications.resnet50.decode_predictions(preds, top=3)[0]

features = np.squeeze(features) # 7, 7, 2048
pred = np.argmax(preds)

Now we have the prediction result (preds) and the feature maps from the last convolutional layer (features). To generate the Class Activation Map (CAM), we need the weights of the predicted class from the output (prediction) layer.

last_layer_weights = resnet_model.layers[-1].get_weights()[0]
last_layer_from_preds = last_layer_weights[:, np.argmax(preds)]

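Then we upsample the feature maps to the size of the input image.

# Zoom factors: 224 / 7 = 32 along each spatial axis; the channel axis is left unchanged
h = int(img.shape[0] / features.shape[0])
w = int(img.shape[1] / features.shape[1])
# order=1 selects linear interpolation
interpolated_filter = scipy.ndimage.zoom(features, (h, w, 1), order=1)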

Then compute the weighted sum of the upsampled feature maps, using the weights in the “last_layer_from_preds” variable, to generate the heatmap.

heatmap = np.dot(interpolated_filter.reshape((224 * 224, 2048)), last_layer_from_preds).reshape(224, 224)

plt.title("CAMs Heatmap")
plt.imshow(heatmap)
plt.colorbar()
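A possible optimization, offered as a sketch and not as part of the original notebook: linear interpolation and the weighted sum are both linear operations, so their order can be swapped. Computing the weighted sum at 7x7 first and upsampling a single 2-D map avoids building the large 224x224x2048 intermediate array. The toy arrays below (`toy_features`, `toy_weights`) and the smaller `scale` are hypothetical stand-ins for `features`, `last_layer_from_preds`, and the factor of 32.

```python
import numpy as np
import scipy.ndimage

rng = np.random.default_rng(0)

# Toy stand-ins (hypothetical sizes) for `features` and `last_layer_from_preds`
toy_features = rng.normal(size=(7, 7, 16))
toy_weights = rng.normal(size=16)
scale = 4  # the post uses 32 (224 / 7)

# Order used in the post: upsample every feature map, then take the weighted sum
upsampled = scipy.ndimage.zoom(toy_features, (scale, scale, 1), order=1)
cam_a = upsampled @ toy_weights

# Alternative: weighted sum at 7x7 first, then upsample one 2-D map
cam_small = toy_features @ toy_weights
cam_b = scipy.ndimage.zoom(cam_small, (scale, scale), order=1)

print(cam_a.shape, cam_b.shape, np.allclose(cam_a, cam_b))
```

Both routes should give the same heatmap up to floating-point error; the second only interpolates one channel instead of 2048.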

Now we have generated the Class Activation Map (CAM) from the ResNet50 model. We can then overlay the CAM heatmap on the original image.

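# Min-max normalize to [0, 1] first; the heatmap can contain negative values,
# which would not survive the cast to uint8
heatmap_norm = (heatmap - np.min(heatmap)) / (np.max(heatmap) - np.min(heatmap))
heatmap_img = cv2.applyColorMap(np.uint8(255 * heatmap_norm), cv2.COLORMAP_JET)
img_bgr = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)

overlapped_heatmap = cv2.addWeighted(img_bgr, 0.5, heatmap_img, 0.5, 0)
# OpenCV works in BGR; convert to RGB for Matplotlib
overlapped_heatmap_rgb = cv2.cvtColor(overlapped_heatmap, cv2.COLOR_BGR2RGB)

plt.title("CAMs Result")
plt.imshow(overlapped_heatmap_rgb)

CAMs Result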

Limitation

CAMs are applicable only to CNN models that end with a Global Average Pooling (GAP) layer followed by a single output layer.

Source Code:

https://colab.research.google.com/drive/1SCQSsdEw96hhERuKSLh9MY054uFEul1K?usp=sharing

References

https://arxiv.org/pdf/1512.04150

https://www.pinecone.io/learn/class-activation-maps/
