Gradient-weighted Class Activation Mapping (Grad-CAM)


A technique for making Convolutional Neural Network (CNN)-based models more transparent by visualizing the regions of the input that are “important” for the models' predictions, i.e., by producing visual explanations.

Proposed approach


The gradient of the loss (for the category ‘cat’) with respect to the input pixels gives,

Deconv and Guided Backprop


Class Activation Mapping (CAM) modifies the base network by removing all fully-connected layers at the end and adding a tensor product (followed by softmax) that takes the Global-Average-Pooled convolutional feature maps as input and outputs the probability for each class.
This results in a coarse heat-map of the same size as the convolutional feature maps (14×14 in the case of the last convolutional layers of the VGG and AlexNet networks).
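To make the coarse heat-map concrete, here is a minimal NumPy sketch of how Grad-CAM computes it: each feature map is weighted by the spatial average of the class score's gradient with respect to that map, the weighted maps are summed, and a ReLU keeps only positive evidence. It assumes the activations and gradients of the chosen convolutional layer have already been extracted from the network. The function name `grad_cam_heatmap`, the (K, H, W) array layout, and the final rescaling to [0, 1] are illustrative choices, not taken from the original source code.

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Coarse Grad-CAM map from one conv layer.

    activations: feature maps A^k, shape (K, H, W)
    gradients:   d(class score)/dA^k, same shape (assumed precomputed)
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")

    # alpha_k: global-average-pool the gradients over the spatial dimensions.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)

    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)

    # ReLU: keep only features with a positive influence on the class.
    cam = np.maximum(cam, 0.0)

    # Rescale to [0, 1] for display (an assumption, purely for visualization).
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam


# Toy usage with random stand-ins for real network tensors.
rng = np.random.default_rng(0)
acts = rng.random((8, 14, 14))
grads = rng.standard_normal((8, 14, 14))
heatmap = grad_cam_heatmap(acts, grads)
print(heatmap.shape)  # (14, 14)
```

The output has the spatial size of the feature maps, not of the input image, which is why the map is coarse and has to be upsampled before it can be overlaid on the image.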

Guided Grad-CAM

While Grad-CAM visualizations are class-discriminative and localize relevant image regions well, they lack the ability to show fine-grained importance the way pixel-space gradient visualization methods (Guided Backpropagation and Deconvolution) do. For example, in the left image of the figure above, Grad-CAM can easily localize the cat region; however, it is unclear from the low-resolution heat-map why the network predicts this particular instance as ‘tiger cat’. To combine the best aspects of both, we can fuse Guided Backpropagation and Grad-CAM visualizations via pointwise multiplication. The Grad-CAM overview figure above illustrates this fusion.
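The fusion step can be sketched in a few lines of NumPy: upsample the coarse Grad-CAM map to the input resolution, then multiply it pointwise with the Guided Backpropagation map. This is a minimal sketch under stated assumptions. The function name `guided_grad_cam` is hypothetical, both inputs are assumed to be precomputed, and nearest-neighbor upsampling is used only to keep the example dependency-free (bilinear interpolation is the more usual choice).

```python
import numpy as np


def guided_grad_cam(heatmap, guided_backprop):
    """Fuse a coarse Grad-CAM map with a Guided Backpropagation map.

    heatmap:         shape (h, w), coarse class-discriminative map
    guided_backprop: shape (H, W, C), pixel-space gradient image
    H and W are assumed to be integer multiples of h and w.
    """
    h, w = heatmap.shape
    H, W = guided_backprop.shape[:2]
    if H % h != 0 or W % w != 0:
        raise ValueError("image size must be a multiple of the heat-map size")

    # Nearest-neighbor upsampling: repeat each cell along both spatial axes.
    upsampled = np.repeat(np.repeat(heatmap, H // h, axis=0), W // w, axis=1)

    # Pointwise multiplication, broadcast over the channel axis.
    return guided_backprop * upsampled[..., None]


# Toy usage: a 2x2 heat-map masks a 4x4 three-channel gradient image.
coarse = np.array([[0.0, 1.0],
                   [0.5, 0.0]])
gb = np.ones((4, 4, 3))
fused = guided_grad_cam(coarse, gb)
print(fused.shape)  # (4, 4, 3)
```

Pixels where the Grad-CAM map is zero are suppressed entirely, so the fine-grained detail from Guided Backpropagation survives only inside the class-relevant region.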


Original source code: (PyTorch)

An example of Grad-CAM code with Keras


A live demo on Grad-CAM applied to image classification can be found at

Research assistant at the Perception, Robotics, and Intelligent Machines Group
