Understanding the predictions of convolutional neural networks: saliency maps

Machine Learning & Statistics
3 min read · Feb 14, 2023


As data scientists, we often work with deep learning models to solve a wide range of problems. One of the challenges of working with deep learning models is understanding how they make predictions. In the case of convolutional neural networks (CNNs) used for image classification tasks, it is often unclear how the network is arriving at its predictions.

Introduction to CNNs

Let’s briefly review what convolutional neural networks (CNNs) are and how they work. CNNs are a class of deep learning models designed to process and analyze images. They consist of multiple layers, each of which performs a specific operation on its input.

The first layer is usually a convolutional layer, which applies a set of learned filters to the image to extract feature maps. The output of the convolutional layer is then passed through one or more pooling layers, which downsample the feature maps and reduce their dimensionality. Finally, the pooled features are passed through one or more fully connected layers, which produce the final classification prediction.
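
As a concrete illustration, here is a minimal PyTorch sketch of that convolution → pooling → fully connected pipeline. The layer sizes, the 32×32 input, and the 10-class output are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Convolutional layer: learned filters extract feature maps
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        # Pooling layer: downsamples the feature maps
        self.pool = nn.MaxPool2d(2)
        # Fully connected layer: maps pooled features to class scores
        self.fc = nn.Linear(16 * 16 * 16, num_classes)  # assumes 3x32x32 input

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))  # (N, 16, 16, 16)
        x = x.flatten(1)                         # (N, 4096)
        return self.fc(x)                        # (N, num_classes)
```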

The Challenge of Understanding CNN Predictions

While CNNs have been highly successful in image classification tasks, they are often treated as “black boxes” because it is not clear how they arrive at their predictions. This lack of interpretability is a major challenge when using CNNs in real-world applications, where we need to understand how the network is making its predictions. The paper “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps” (ICLR 2014 workshop) addresses this challenge by providing a way to visualize what a CNN has learned and to identify the regions of an image that matter most for the network’s prediction.

Visualizing Feature Activations and Saliency Maps

The key contribution of the paper is a gradient-based method for visualizing what a trained CNN has learned: class models that show what the network considers representative of each class, and saliency maps that highlight the regions of an image that are most important for the network’s prediction.

The authors address two important questions in their work:

1. Saliency Map: Which pixels in an image are most crucial for the network to consider when making a decision?

The gradient of the output class score with respect to the input image is computed using backpropagation. This gradient measures how sensitive the class score is to a small change in each input pixel.

The gradient therefore lets us determine which pixels in the input image have the greatest influence on the prediction.

The saliency map highlights the regions of the image that are most influential for the network’s prediction.

These saliency maps can be used both to identify the features the network relies on to produce its predictions and to flag potential problem areas, for example when the network focuses on the background rather than the object itself.
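
A minimal PyTorch sketch of this computation is shown below. It assumes `model` is a trained classifier and `image` is an already preprocessed `(C, H, W)` tensor; following the paper, the class score (before the softmax) is backpropagated, and the per-pixel saliency is the maximum absolute gradient over color channels.

```python
import torch

def saliency_map(model, image, target_class):
    """Gradient of the class score w.r.t. the input image."""
    model.eval()
    image = image.clone().unsqueeze(0).requires_grad_(True)  # (1, C, H, W)
    scores = model(image)                                    # (1, num_classes)
    # Backpropagate the (pre-softmax) score of the target class
    scores[0, target_class].backward()
    # Per-pixel saliency: maximum absolute gradient across color channels
    saliency = image.grad.detach().abs().max(dim=1)[0].squeeze(0)  # (H, W)
    return saliency
```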

Saliency maps for a few example images. Source: https://arxiv.org/pdf/1312.6034.pdf

2. Class Model: What does the network’s learned notion of a target class such as “dog” or “cat” look like?

The authors suggest finding, by gradient ascent, the image that maximizes the class score, subject to an L2-norm penalty that keeps the pixel values bounded:

Class model visualizations for a few classes. Source: https://arxiv.org/pdf/1312.6034.pdf
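
In the paper’s formulation this amounts to finding argmax_I S_c(I) − λ‖I‖², where S_c is the class score. A rough PyTorch sketch is given below; it assumes `model` is a trained classifier, and the step count, learning rate, regularization strength, and image size are illustrative choices (in practice the optimized image is also un-normalized for display).

```python
import torch

def class_model_visualization(model, target_class, steps=200, lr=1.0,
                              weight_decay=1e-4, image_shape=(1, 3, 224, 224)):
    """Gradient ascent on the class score with an L2 penalty on the image."""
    model.eval()
    image = torch.zeros(image_shape, requires_grad=True)  # start from a zero image
    # weight_decay applied to the image tensor implements the L2-norm regularization
    optimizer = torch.optim.SGD([image], lr=lr, weight_decay=weight_decay)
    for _ in range(steps):
        optimizer.zero_grad()
        score = model(image)[0, target_class]  # pre-softmax class score
        (-score).backward()                    # ascend the score by descending its negative
        optimizer.step()
    return image.detach().squeeze(0)
```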

Conclusion

By generating saliency maps and visualizing class models, we can gain insight into how a CNN processes an image and identify the features that matter most for its prediction. This is valuable in real-world image classification applications, where understanding how the network arrives at its predictions is essential.
