Published in TruLens

Overfitting and Conceptual Soundness in Neural Networks

How feature usage informs our understanding of overfitting in deep networks

Photo by Daniel Andrade on Unsplash

An Illustrative Example

Sample of LFW training points. We see the image in the top right corner has a distinctive pink background. (Image from Leveraging Model Memorization for Calibrated White-Box Membership Inference)
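Before reaching the model below, the LFW images need to be brought to a common 64×64 RGB shape and scaled to a numeric range suitable for training. The article doesn't show its preprocessing, so here is a minimal numpy-only sketch; the nearest-neighbor resize and [0, 1] scaling are illustrative assumptions, not the article's exact pipeline (in practice you might instead load the data with `sklearn.datasets.fetch_lfw_people`).

```python
import numpy as np

def preprocess(images, size=64):
    """Resize a batch of images to (size, size, 3) by nearest-neighbor
    sampling and scale pixel values to [0, 1].

    images: array of shape (n, height, width, 3).
    """
    n, h, w = images.shape[:3]
    # Indices of the source rows/columns each output pixel samples from.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = images[:, rows][:, :, cols].astype(np.float32)
    if resized.max() > 1.0:
        resized /= 255.0  # assume 0-255 inputs; scale to [0, 1]
    return resized
```

With this, `x_train = preprocess(raw_train_images)` produces arrays matching the `Input((64,64,3))` layer used below.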
from tensorflow.keras.layers import (
    Activation, Conv2D, Dense, Flatten, Input, MaxPooling2D)
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.models import Model

from trulens.nn.models import get_model_wrapper

# Define our model.
x = Input((64, 64, 3))
z = Conv2D(20, 5, padding='same')(x)
z = Activation('relu')(z)
z = MaxPooling2D()(z)
z = Conv2D(50, 5, padding='same')(z)
z = Activation('relu')(z)
z = MaxPooling2D()(z)
z = Flatten()(z)
z = Dense(500)(z)
z = Activation('relu')(z)
y = Dense(5)(z)

keras_model = Model(x, y)

# Compile and train the model.
keras_model.compile(
    loss=SparseCategoricalCrossentropy(from_logits=True),
    optimizer='rmsprop',
    metrics=['sparse_categorical_accuracy'])

keras_model.fit(
    x_train,
    y_train,
    epochs=50,
    batch_size=64,
    validation_data=(x_test, y_test))

# Wrap the model as a TruLens model.
model = get_model_wrapper(keras_model)
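Since overfitting is the central concern here, it is worth quantifying the train/test accuracy gap directly rather than eyeballing training curves. A small helper sketch (the function names are illustrative, not part of TruLens; it expects logits from `keras_model.predict` and integer label arrays as above):

```python
import numpy as np

def accuracy(logits, labels):
    """Fraction of examples whose argmax prediction matches the label."""
    return float((logits.argmax(axis=1) == labels).mean())

def generalization_gap(train_logits, y_train, test_logits, y_test):
    """Train accuracy minus test accuracy. A large positive gap
    suggests the model has latched onto training-specific features."""
    return accuracy(train_logits, y_train) - accuracy(test_logits, y_test)
```

For example, `generalization_gap(keras_model.predict(x_train), y_train, keras_model.predict(x_test), y_test)` near zero is consistent with good generalization, while a large gap hints at memorization of artifacts like the pink background discussed above.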
from trulens.nn.attribution import InternalInfluence
from trulens.visualizations import HeatmapVisualizer

layer = 4

# Define the influence measure.
internal_infl_attributer = InternalInfluence(
    model, layer, qoi='max', doi='point')

internal_attributions = internal_infl_attributer.attributions(
    instance)

# Take the max over the width and height to get an attribution for
# each channel.
channel_attributions = internal_attributions.max(
    axis=(1, 2)
).mean(axis=0)
target_channel = int(channel_attributions.argmax())

# Calculate the input pixels that are most influential on the
# target channel.
input_attributions = InternalInfluence(
    model, (0, layer), qoi=target_channel, doi='point'
).attributions(instance)

# Visualize the influential input pixels.
_ = HeatmapVisualizer(blur=3)(input_attributions, instance)
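The reduction in the snippet above (max over width and height, mean over the batch axis, then argmax) can be illustrated on a toy attribution tensor. A numpy sketch with made-up shapes, assuming the same (batch, height, width, channels) layout:

```python
import numpy as np

# Toy internal attributions: (batch, height, width, channels).
attrs = np.zeros((1, 4, 4, 3))
attrs[0, 2, 2, 1] = 5.0   # channel 1 has one strongly influential cell
attrs[0, :, :, 0] = 0.5   # channel 0 is uniformly, weakly influential

# Max over the spatial dims gives each channel's peak influence,
# and the mean averages that peak over the batch.
channel_attrs = attrs.max(axis=(1, 2)).mean(axis=0)

# Argmax picks the single most influential channel to inspect further.
target = int(channel_attrs.argmax())  # channel 1 wins on peak influence
```

Note that taking the max (rather than, say, the sum) rewards a channel with one sharply influential location, which is exactly the kind of localized feature the heatmap step then traces back to input pixels.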
Explanation of how the neural network identifies an image, showing that the background was highly relevant to the identification.
(Image by Author)
import matplotlib.pyplot as plt

from trulens.visualizations import ChannelMaskVisualizer
from trulens.visualizations import Tiler

visualizer = ChannelMaskVisualizer(
    model,
    layer,
    target_channel,
    blur=3,
    threshold=0.9)

visualization = visualizer(instance)
plt.imshow(Tiler().tile(visualization))
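The `threshold=0.9` argument keeps only the most influential pixels. Conceptually this is a percentile cutoff; a numpy sketch of that idea (an illustration of the concept, not TruLens's exact implementation):

```python
import numpy as np

def percentile_mask(attributions, threshold=0.9):
    """Return a boolean mask keeping pixels whose attribution falls in
    the top (1 - threshold) fraction, e.g. the top 10% for 0.9."""
    cutoff = np.quantile(attributions, threshold)
    return attributions >= cutoff

# On a 10x10 map of distinct values, roughly 10 of 100 pixels survive.
attr = np.arange(100, dtype=float).reshape(10, 10)
mask = percentile_mask(attr, threshold=0.9)
```

Masking the image with `mask` (plus the blur) produces the highlighted regions shown in the visualization below.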
(Image by Author)
Image of Tony Blair showing how the deep learning model arrived at its identification. In this case, the facial features are highlighted, indicating that the model is working as intended.
(Image by Author)

Catching Mistakes with Explanations

Two images of Gerhard Schroeder, showing how the image on the right can be altered so that a poorly calibrated neural network model can misidentify the image as Tony Blair.
Original image of Gerhard Schroeder from LFW (left) and edited version (right). (Image by Author)
>>> keras_model.predict(original).argmax(axis=1)
array([3])
>>>
>>> keras_model.predict(edited).argmax(axis=1)
array([4])
>>> keras_model_no_pink.predict(original).argmax(axis=1)
array([3])
>>>
>>> keras_model_no_pink.predict(edited).argmax(axis=1)
array([3])
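The experiment above hinges on editing only background pixels and checking whether the prediction flips. A hedged numpy sketch of that editing step (the pink RGB value and the background mask are illustrative assumptions; the article's edit was made by hand):

```python
import numpy as np

def recolor_background(image, background_mask, color=(1.0, 0.45, 0.75)):
    """Return a copy of `image` (H, W, 3 floats in [0, 1]) with pixels
    selected by the boolean `background_mask` replaced by `color`,
    here an approximate pink like the distinctive training background."""
    edited = image.copy()
    edited[background_mask] = color
    return edited
```

If the model's prediction changes on `recolor_background(original, mask)` while a comparable edit to foreground pixels leaves it unchanged, that is evidence the model is leaning on the background rather than the face.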

Other Implications of Overfitting

Summary

References

  1. Leino et al. “Influence-Directed Explanations for Deep Convolutional Networks.” ITC 2018. (arXiv)
  2. Leino & Fredrikson. “Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference.” USENIX Security 2020. (arXiv)


TruLens provides explainability for neural network machine learning models. We’re building a community of developers that are driving AI forward.

Klas Leino

Klas received his PhD at CMU studying the weaknesses and vulnerabilities of deep learning; he works to improve DNN security, transparency, and privacy.