
A Hands-on Introduction to Explaining Neural Networks with TruLens

Transparency and Attribution Methods

Enter TruLens

The Basics: Model Wrappers

# Tensorflow/Keras
from tensorflow.keras.applications.vgg16 import VGG16
from trulens.nn.models import get_model_wrapper

keras_model = VGG16(weights='imagenet')

# Produce a wrapped model from the keras model.
model = get_model_wrapper(keras_model)

# Pytorch
from torchvision.models import vgg16
from trulens.nn.models import get_model_wrapper

pytorch_model = vgg16(pretrained=True)

# Produce a wrapped model from the pytorch model.
model = get_model_wrapper(
    pytorch_model, input_shape=(3,224,224), device='cpu')

Input Attributions: Saliency Maps and Integrated Gradients

  • Saliency maps (Simonyan et al.) take the gradient of the network’s output at the predicted class with respect to its input at a given point.
  • Integrated gradients (Sundararajan et al.) address some potential shortcomings of saliency maps by aggregating the gradient over a line in the model’s input space that interpolates from a selected baseline to the given point. In practice, this is done by averaging the gradient taken at uniformly-spaced samples along that line, as sketched in code after this list.
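
As a rough sketch of that averaging step (not TruLens code; grad_fn stands for a hypothetical function returning the gradient of the class score at a point, and the baseline defaults to all zeros):

import numpy as np

def integrated_gradients(grad_fn, x, baseline=None, resolution=10):
    # Approximate the line integral by averaging gradients taken at
    # uniformly-spaced points on the line from `baseline` to `x`.
    if baseline is None:
        baseline = np.zeros_like(x)
    alphas = np.linspace(0.0, 1.0, resolution)
    avg_grad = np.mean(
        [grad_fn(baseline + a * (x - baseline)) for a in alphas], axis=0)
    # Scale the averaged gradient by the displacement from the baseline.
    return (x - baseline) * avg_grad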

Saliency Maps
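
The snippets below run on a single photo containing both a beagle and a bicycle, referred to as x. A minimal sketch of producing it, assuming the Keras-wrapped VGG16 from above and a hypothetical local file beagle_bike.jpg:

import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input

# Load the (hypothetical) beagle-and-bicycle photo at VGG16's input size.
img = image.load_img('beagle_bike.jpg', target_size=(224, 224))

# Convert to a batch of one and apply VGG16's ImageNet preprocessing.
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))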

from trulens.nn.attribution import InputAttribution
from trulens.visualizations import MaskVisualizer

beagle_bike_input = x

# Create the attribution measure.
saliency_map_computer = InputAttribution(model)

# Calculate the input attributions.
input_attributions = saliency_map_computer.attributions(
    beagle_bike_input)

# Visualize the attributions as a mask on the original image.
visualizer = MaskVisualizer(blur=10, threshold=0.95)
visualization = visualizer(input_attributions, beagle_bike_input)
[Image: saliency-map mask over the beagle photo] Visualized Beagle Class Explanation: Saliency Map
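
The visualizer returns the masked image as an array rather than displaying it; one way to look at it, assuming a standard matplotlib setup and that the output is a batch of image-shaped arrays:

import matplotlib.pyplot as plt

# Show the masked image for the first (and only) input in the batch.
# (Assumes the visualizer output is a batch of image-shaped arrays.)
plt.imshow(visualization[0])
plt.axis('off')
plt.show()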

Integrated Gradients

from trulens.nn.attribution import IntegratedGradients
from trulens.visualizations import MaskVisualizer
# Create the attribution measure.
ig_computer = IntegratedGradients(model, resolution=10)
# Calculate the input attributions.
input_attributions = ig_computer.attributions(beagle_bike_input)
# Visualize the attributions as a mask on the original image.
visualizer = MaskVisualizer(blur=10, threshold=0.95)
visualization = visualizer(input_attributions, beagle_bike_input)
[Image: integrated-gradients mask over the beagle photo] Visualized Beagle Class Explanation: Integrated Gradients

Generalizing Beyond Input Attributions: Attribution Flexibility

from trulens.nn.attribution import InternalInfluence
from trulens.nn.distributions import LinearDoi
from trulens.nn.quantities import MaxClassQoI
from trulens.nn.slices import InputCut, OutputCut, Slice

# Create the attribution measure.
ig_computer = InternalInfluence(
    model,
    Slice(InputCut(), OutputCut()),
    MaxClassQoI(),
    LinearDoi(resolution=10))

# Calculate the input attributions.
input_attributions = ig_computer.attributions(beagle_bike_input)
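
This configuration should reproduce the integrated-gradients attributions computed in the previous section; a quick sanity check, assuming that earlier result was kept around under a separate name such as ig_attributions:

import numpy as np

# `ig_attributions` is assumed to hold the result from the Integrated
# Gradients section; the two computations should closely agree.
print(np.allclose(ig_attributions, input_attributions, atol=1e-5))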

Internal Influence

from trulens.nn.attribution import InternalInfluence
from trulens.nn.distributions import PointDoi
from trulens.nn.quantities import MaxClassQoI
from trulens.nn.slices import Cut, OutputCut, Slice

# Create the attribution measure.
internal_infl_computer = InternalInfluence(
    model,
    Slice(Cut('block4_conv3'), OutputCut()),
    MaxClassQoI(),
    PointDoi())

# Get the attributions for the internal neurons at the
# 'block4_conv3' layer. Because 'block4_conv3' contains 2D feature
# maps, we take the sum over the width and height of the feature
# maps to obtain a single attribution for each feature map.
internal_attrs = internal_infl_computer.attributions(
    beagle_bike_input).sum(axis=(1,2))

from trulens.nn.attribution import InternalInfluence

# Create the attribution measure (shorthand for the above).
internal_infl_computer = InternalInfluence(
    model, 'block4_conv3', 'max', 'point')

# Get the attributions for the internal neurons at the
# 'block4_conv3' layer. Because 'block4_conv3' contains 2D feature
# maps, we take the sum over the width and height of the feature
# maps to obtain a single attribution for each feature map.
internal_attrs = internal_infl_computer.attributions(
    beagle_bike_input).sum(axis=(1,2))

# Find the index of the top feature map.
top_feature_map = int(internal_attrs[0].argmax())
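
Beyond the single most influential feature map, the same attributions can be used to rank all of the layer's feature maps; a small numpy sketch reusing the variables above:

import numpy as np

# Rank the feature maps in 'block4_conv3' by attribution, highest first.
ranked_feature_maps = np.argsort(internal_attrs[0])[::-1]

# The five feature maps most influential on the top-class prediction.
print(ranked_feature_maps[:5])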

Visualizing Important Internal Neurons

from trulens.nn.attribution import InternalInfluence
from trulens.visualizations import MaskVisualizer

input_infl_computer = InternalInfluence(
    model, (0, 'block4_conv3'), top_feature_map, 'point')

# The above is shorthand for:
#
# infl_input = InternalInfluence(
#     model,
#     (InputCut(), Cut('block4_conv3')),
#     InternalChannelQoI(top_feature_map),
#     PointDoi())

input_attributions = input_infl_computer.attributions(
    beagle_bike_input)

# Visualize the attributions as a mask on the original image.
visualizer = MaskVisualizer(blur=10, threshold=0.95)
visualization = visualizer(input_attributions, beagle_bike_input)
[Image: mask showing the input regions that drive the top internal feature map] Visualized Beagle Class Explanation: Internal Neuron
from trulens.visualizations import ChannelMaskVisualizer

visualizer = ChannelMaskVisualizer(
    model,
    'block4_conv3',
    top_feature_map,
    blur=10,
    threshold=0.95)

visualization = visualizer(beagle_bike_input)

Other Quantities of Interest

from trulens.nn.attribution import InternalInfluence
from trulens.visualizations import ChannelMaskVisualizer
BIKE_CLASS = 671  # ImageNet class index for 'mountain bike'.

# Create the attribution measure.
internal_infl_computer = InternalInfluence(
    model, 'block4_conv3', BIKE_CLASS, 'point')
# The above is shorthand for
#
# infl_bike = InternalInfluence(
#     model,
#     Slice(Cut('block4_conv3'), OutputCut()),
#     ClassQoI(BIKE_CLASS),
#     'point')

# Get the attributions for each feature map.
internal_attrs = internal_infl_computer.attributions(
    beagle_bike_input).sum(axis=(1,2))
# Find the index of the top feature map.
top_feature_map_bike = int(internal_attrs[0].argmax())
# Visualize the top feature map in the input space.
visualizer = ChannelMaskVisualizer(
    model,
    'block4_conv3',
    top_feature_map_bike,
    blur=10,
    threshold=0.95)

visualization = visualizer(beagle_bike_input)
[Image: mask highlighting the bicycle regions of the beagle-and-bicycle photo] Visualized Bike Class Explanation: Internal Neuron
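
The same pattern works for any class the model can predict. Repeating it with the beagle class, for instance, makes it easy to check that the beagle and bicycle predictions are driven by different feature maps; a sketch assuming BEAGLE_CLASS holds the ImageNet index for 'beagle':

from trulens.nn.attribution import InternalInfluence

BEAGLE_CLASS = 162  # ImageNet class index for 'beagle' (assumed here).

# Attribution toward the beagle class for each feature map in
# 'block4_conv3', summed over the spatial dimensions as before.
beagle_infl_computer = InternalInfluence(
    model, 'block4_conv3', BEAGLE_CLASS, 'point')
beagle_attrs = beagle_infl_computer.attributions(
    beagle_bike_input).sum(axis=(1,2))

top_feature_map_beagle = int(beagle_attrs[0].argmax())

# The two classes are typically explained by different feature maps.
print(top_feature_map_beagle, top_feature_map_bike)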

References

  1. Leino et al. “Influence-directed Explanations for Deep Convolutional Networks.” ITC 2018. arXiv
  2. Simonyan et al. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.” 2014. arXiv
  3. Sundararajan et al. “Axiomatic Attribution for Deep Networks.” ICML 2017. arXiv



Klas Leino

Klas received his PhD at CMU studying the weaknesses and vulnerabilities of deep learning; he works to improve DNN security, transparency, and privacy.