Interpretability in Computer Vision

Post-hoc vs. Built-in Methods

Tianyu Yang
Towards NeSy
Mar 27, 2023


In this blog, I introduce the two main categories of methods for improving interpretability in computer vision classification tasks. To illustrate the built-in category, I walk through ProtoPNet as an example.

Background

In the last decade, the development of deep learning, especially convolutional neural networks (CNNs), has made it possible to automatically extract high-level abstract features from images, providing an effective solution for computer-aided classification. Deep CNNs such as VGGNet, ResNet, and DenseNet stack convolutional layers with different kernels and network structures to extract features from different perspectives and achieve strong classification results. Although deep CNNs far outperform traditional learning models in prediction performance, they share a common drawback: a lack of interpretability.

What’s interpretability?

In an ICML 2017 tutorial, Been Kim of Google Brain gave a widely cited definition: interpretation is the process of giving explanations to humans. [1]

There are two general approaches to improving interpretability:

  • Post-hoc: Train a black-box model, then apply interpretability techniques after training to explain its decisions. In other words, we use separate interpretable methods to understand the trained model and surface what its predictions depend on. Example: LIME (Local Interpretable Model-agnostic Explanations) [2]; a minimal sketch of applying LIME follows this list.
  • Built-in: Use models that are intrinsically interpretable by design, so the decision process and mechanism can be understood directly as the model runs. Example: ProtoPNet [3], described in detail below.
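
As a concrete illustration of the post-hoc category, here is a minimal sketch of running LIME on a pretrained classifier. The choice of ResNet-50 as the black box, the classifier_fn wrapper, and the random placeholder image are assumptions made for this sketch, not part of the LIME paper itself.

```python
import numpy as np
import torch
from PIL import Image
from torchvision.models import resnet50, ResNet50_Weights
from lime import lime_image  # pip install lime

# Assumed black-box model: an off-the-shelf ImageNet ResNet-50.
weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()

def classifier_fn(images):
    """LIME passes a batch of H x W x 3 arrays; return class probabilities."""
    batch = torch.stack([preprocess(Image.fromarray(img.astype(np.uint8))) for img in images])
    with torch.no_grad():
        logits = model(batch)
    return torch.softmax(logits, dim=1).numpy()

# Placeholder input; in practice, load a real H x W x 3 image here.
image = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, classifier_fn, top_labels=1, hide_color=0, num_samples=1000
)
# Superpixels that most support the top predicted class.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
```

Under the hood, LIME perturbs superpixels of the input, queries the black box on the perturbed copies, fits a local linear surrogate model to the outputs, and returns the superpixels with the largest surrogate weights as the explanation.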

Details of ProtoPNet

ProtoPNet (“This Looks Like That”) is a deep learning architecture for image recognition that is both accurate and interpretable. Rather than comparing pairs of whole images, the network learns a set of prototypes: class-specific prototypical parts stored as small patches in the latent space of a convolutional backbone. To classify an input, ProtoPNet compares patches of the image’s feature map against each learned prototype and produces a similarity score that answers the question “does this part of the image look like that prototypical part of a known class?”. The evidence for each class is then a weighted sum of these similarity scores. Because every prototype is projected onto, and visualized as, an actual patch from a training image, the similarity scores give a case-based explanation of the prediction: the network can show which parts of the input looked like which prototypical parts, and how much each comparison contributed to the final decision.

Architecture

  • Convolutional layers: pass an image x through the convolutional backbone to obtain the feature map f(x).
  • Prototype layer: compute a similarity score between each prototype and each spatial patch of f(x), then max-pool each prototype’s score map down to a single value, yielding one similarity score per prototype.
  • FC layer: aggregate the weighted similarity scores of all prototypes to produce the classification logits.
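
The following PyTorch sketch ties these three steps together. The sizes here (128-dimensional latent patches, 10 prototypes per class, 200 classes) and the generic backbone are assumptions for illustration; this is a simplification of the idea, not the authors’ released implementation.

```python
import torch
import torch.nn as nn

class ProtoPNetSketch(nn.Module):
    """Illustrative ProtoPNet-style model: backbone -> prototype layer -> FC layer."""
    def __init__(self, backbone, num_classes=200, protos_per_class=10, proto_dim=128):
        super().__init__()
        self.backbone = backbone  # conv layers f(.), assumed to output (B, proto_dim, H, W)
        num_protos = num_classes * protos_per_class
        # Learnable prototypes: 1x1 patches living in the latent space of f(x).
        self.prototypes = nn.Parameter(torch.rand(num_protos, proto_dim))
        self.fc = nn.Linear(num_protos, num_classes, bias=False)
        self.epsilon = 1e-4

    def forward(self, x):
        f = self.backbone(x)                              # (B, D, H, W)
        patches = f.flatten(2).transpose(1, 2)            # (B, H*W, D): one row per spatial patch
        # Squared L2 distance between every patch z and every prototype p:
        # ||z - p||^2 = ||z||^2 - 2 z.p + ||p||^2, broadcast to (B, H*W, P)
        sq_dists = (
            (patches ** 2).sum(-1, keepdim=True)
            - 2 * patches @ self.prototypes.t()
            + (self.prototypes ** 2).sum(-1)
        ).clamp(min=0)
        # Max-pooling on similarity = keeping, for each prototype, its best-matching patch.
        min_dists, _ = sq_dists.min(dim=1)                # (B, P)
        # Log activation turns a small distance into a large similarity score.
        similarities = torch.log((min_dists + 1) / (min_dists + self.epsilon))
        return self.fc(similarities)                      # class logits (B, num_classes)
```

A truncated pretrained CNN followed by a 1x1 convolution that projects down to proto_dim channels would be a natural choice for the backbone here.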

How is the similarity between a prototype and the feature map calculated?

For each prototype p, the network computes the squared L2 distance between p and every spatial patch z of the feature map f(x), and converts the smallest distance (the best-matching patch) into a similarity score with a log activation: similarity = log((||z − p||² + 1) / (||z − p||² + ε)). The score is large when some patch of the image is very close to the prototype in latent space, and close to zero when no patch matches. During training, in addition to the usual cross-entropy term, the loss encourages patches of an image to be close to at least one prototype of its own class (cluster cost) and far from prototypes of other classes (separation cost).
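
To make the shape of this activation concrete, here is a tiny numeric sketch; the value ε = 1e-4 is an assumption for the illustration.

```python
import torch

def proto_similarity(sq_dist, epsilon=1e-4):
    """ProtoPNet-style log activation: monotonically decreasing in the squared distance."""
    return torch.log((sq_dist + 1) / (sq_dist + epsilon))

# A patch that coincides with the prototype (distance 0) gets a large score,
# while patches far from the prototype contribute scores close to zero.
print(proto_similarity(torch.tensor([0.0, 0.5, 5.0, 50.0])))
# approximately: [9.21, 1.10, 0.18, 0.02]
```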

Limitation

The paper primarily demonstrates the effectiveness of the method on relatively small, fine-grained benchmark datasets (bird species and car model classification), and does not provide extensive examples of broader real-world applications or use cases for the model.

Reference

[1] Been Kim (Google Brain). Interpretable Machine Learning (tutorial). ICML 2017.

[2] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.

[3] Chaofan Chen, Oscar Li, Daniel Tao, Alina Barnett, Cynthia Rudin, and Jonathan K. Su. This Looks Like That: Deep Learning for Interpretable Image Recognition. Advances in Neural Information Processing Systems, 32, 2019.
