Machine Learning Explainability and Robustness: Connected at the Hip

A tutorial on explainability for neural networks and machine learning models, featured at KDD Singapore 2021

Last week, we presented a tutorial at KDD 2021 on explainability and robustness of deep neural networks and the surprising relationships between these two concepts.

You can find the tutorial slides & outline here. Stay tuned for a video sequence that we will release shortly! I created the tutorial with Matt Fredrikson, Klas Leino, Caleb Lu, Shayak Sen, and Zifan Wang — a team that has been researching these topics for over 5 years.

Tutorial Highlights

  1. Foundations of XAI: This section provides guidance on which explanation frameworks are appropriate for use and under what conditions. We highlight requirements for “good explanations”, in particular, explanation accuracy (correctly capturing drivers of model behavior), explanation generality (answering a rich set of queries about model behavior), and interpretation devices (e.g. visualization methods that make the explanations meaningful to humans), as well as how various approaches meet these requirements. We include a quick tour of gradient-based explanation methods, including Saliency Maps, Integrated Gradients, and Influence-directed Explanations.
  2. TruLens Open Source Library: We present TruLens, a new explainability library for deep neural networks. Go play with it! Distinctively, TruLens provides a uniform API for working with models built with PyTorch, TensorFlow, and Keras. You can check out the following Colab notebooks to get started with TruLens:

Building on TruLens, we also share a demo of Boundary Attributions, an explanation method that takes decision boundaries into account and is suitable for accurately explaining classification decisions. You can get started with the demo Colab notebook here:

3. Foundations of Adversarial Robustness: We introduce concepts surrounding adversarial robustness, including state-of-the-art adversarial attacks as well as a range of corresponding defenses. If you are working with deep neural networks and would like to make your models robust, you will find this section useful.

4. Connecting Explainability & Adversarial Robustness: We present key insights from the recent literature on the surprising connections between explainability and adversarial robustness: (a) we show that many commonly perceived issues with explanation methods are actually caused by a lack of robustness in the model; (b) we also show that a careful study of adversarial examples and robustness can lead to models whose explanations better match human intuition and domain knowledge. This section highlights the importance of making models robust to ensure that you get meaningful explanations. As the tutorial title suggests, explainability & robustness are connected at the hip!
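
To make the gradient-based ideas in the highlights a little more concrete, here is a minimal sketch in plain Python. It is not taken from the tutorial and does not use the TruLens API. It assumes a hypothetical toy logistic-regression model with made-up weights, and shows a saliency-style input gradient, a Riemann-sum approximation of Integrated Gradients, and a Fast Gradient Sign Method (FGSM) perturbation as one example of an adversarial attack. All names (`WEIGHTS`, `predict`, `integrated_gradients`, `fgsm`) are illustrative only.

```python
import math

# Hypothetical toy model: logistic regression with fixed, made-up weights.
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = -0.25


def predict(x):
    """Probability of the positive class: sigmoid(w . x + b)."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(x):
    """Saliency-style gradient of the output probability w.r.t. each input."""
    p = predict(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight-line path from the baseline to x.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(input_gradient(point)):
            avg_grad[i] += g / steps
    # Scale the path-averaged gradient by the input difference.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(x, label, epsilon):
    """FGSM step: move each feature by epsilon in the loss-increasing direction."""
    p = predict(x)
    # For a logistic model with cross-entropy loss, d(loss)/dx = (p - label) * w.
    loss_grad = [(p - label) * w for w in WEIGHTS]
    return [xi + epsilon * sign(g) for xi, g in zip(x, loss_grad)]


x = [1.0, 0.5, -1.0]
baseline = [0.0, 0.0, 0.0]  # assumed all-zeros baseline

attributions = integrated_gradients(x, baseline)
x_adv = fgsm(x, label=1, epsilon=0.5)

# The attributions should sum (approximately) to the change in model output.
print("sum of attributions:", sum(attributions))
print("f(x) - f(baseline): ", predict(x) - predict(baseline))
print("prediction before / after FGSM:", predict(x), predict(x_adv))
```

On this toy model, the attributions add up to roughly the difference between the prediction at `x` and at the baseline, and a single FGSM step with `epsilon=0.5` is enough to push the prediction to the other side of 0.5. Real deep networks need automatic differentiation from a framework such as PyTorch or TensorFlow rather than the hand-derived gradients used here.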



Anupam Datta

Passionate about enabling effective and responsible adoption of AI. Co-Founder, Chief Scientist, TruEra; Professor, CMU; PhD CS Stanford