Dark AI and the Promise of Explainability (Part II)

Sheldon Fernandez
DarwinAI
Mar 3, 2020 · 9 min read

AI’s ‘black box’ may be the greatest business and societal risk of our time. Here’s how we bring it to light.

As was illustrated in Part I of this piece, existing approaches to explainability are limited in important ways. Some are indirect, while others lack stability. Some require subjective interpretation, while others lack verifiability.

In other words, to the extent that AI is a black box, current solutions are but tiny pinholes in its protective coating, admitting small but insufficient rays of sunlight through deep learning’s dark exterior.

How can we more fully illuminate the abyss?

At NeurIPS 2019 in Vancouver, DarwinAI presented a seminal paper that benchmarked explainability across a variety of methods (LIME, SHAP, and Expected Gradients), including our own proprietary approach. The crux of the paper centered on a key question: to what extent does the explanation surfaced by an explainability algorithm reflect the actual decision-making process of a neural network?

We showed that by subjecting a deep learning network to a clever psychology test — using a counterfactual approach to remove explanatory variables and having the network reevaluate the result — we could determine the efficacy of a given approach.

Counterfactual explanations attempt to describe how changes in explanation drivers change model outcomes, and are an increasingly important branch of explainability research.

In our study, we measured the impact of explainability algorithms by removing critical inputs (as reported by the prevailing algorithm) from the model and observing how the model’s decision changed: if the identified factors were indeed crucial to the network’s decision-making process, then the absence of such factors should cause the network to be significantly less confident in its decision or to come to a different decision entirely.

Example: Removing a critical region

By way of example, the three images below depict:

  • the original input image (left)
  • the identified critical factors (center)
  • prediction confidences for decisions with critical factors removed (right)

In this instance, the absence of critical factors leads to a change in decision (from iPod to sweatshirt), which suggests such factors are indeed significant to the decision.
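To make this style of test concrete, here is a minimal sketch of the general idea (not the exact protocol from our paper): mask out the region an explainability method flags as critical, re-run the model, and compare predictions. The pretrained ResNet-50, the file name input.jpg, and the placeholder mask region are illustrative assumptions.

```python
# Sketch of a counterfactual impact check: remove the region an explainability
# method reports as critical and see whether the prediction changes.
# The model, image path, and mask region below are illustrative placeholders.
import torch
import torch.nn.functional as F
from torchvision import models, transforms
from PIL import Image

model = models.resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def predict(x):
    """Return (top-1 class index, confidence) for a single image tensor."""
    with torch.no_grad():
        probs = F.softmax(model(x.unsqueeze(0)), dim=1)
    conf, idx = probs.max(dim=1)
    return idx.item(), conf.item()

image = preprocess(Image.open("input.jpg").convert("RGB"))

# Hypothetical 224x224 boolean mask marking the pixels an explainability
# method (LIME, SHAP, Expected Gradients, etc.) identified as critical.
critical_mask = torch.zeros(224, 224, dtype=torch.bool)
critical_mask[60:160, 60:160] = True  # placeholder region

masked = image.clone()
masked[:, critical_mask] = 0.0  # erase the "critical" evidence

before = predict(image)
after = predict(masked)
print(f"original: class {before[0]} @ {before[1]:.2f}")
print(f"masked:   class {after[0]} @ {after[1]:.2f}")
# A faithful explanation should produce a large confidence drop, or a
# different class altogether, once its critical factors are removed.
```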

By employing such counterfactual techniques, we demonstrated that despite the popularity of LIME, SHAP, and Expected Gradients, they can be quite poor at identifying the factors that best reflect a neural network’s decision-making process. Moreover, the study illustrated that our own algorithm substantially outperforms these more common methods. This is a hugely significant finding, which we hope lays the foundation for an interpretable approach to deep learning that greatly advances the efficiency and robustness of neural network design.

Building a White Box: GenSynth Explain

DarwinAI’s core technology is termed Generative Synthesis, or ‘GenSynth’ for short. The byproduct of years of scholarship from our academic team, Generative Synthesis uses AI itself to obtain a deep understanding of a neural network. This understanding forms the lynchpin of our explainability algorithm, GenSynth Explain.

GenSynth Explain gives rise to explainable deep learning whereby developers can understand, interpret, and quantify the inner workings of a deep neural network, creating new possibilities for accelerated and automated neural network design, debugging and error/bias mitigation, and regulatory transparency.

We devised GenSynth Explain to address the shortcomings of existing explainability techniques. In building it, we aimed to produce a platform that:

  1. Captures the inextricable link between data and models: in the words of our Chief Scientist, “There is no data understanding without model understanding and no model understanding without data understanding.”
  2. Optimizes model design and development: the approach should leverage model understanding to automate neural network design against key performance indicators.
  3. Accurately explains the way the model makes a decision: that is, in the absence of the identified variables, the prediction should appreciably change.
  4. Quantitatively explains the way the model makes a decision: the algorithm should produce meaningful and actionable outputs that developers can use to improve their models.
  5. Reflects model intuition regardless of how it ‘reflects back’ to us: the process should articulate model reasoning authentically, as human intuition can differ significantly from model intuition.

We achieved these goals by employing a ‘white box,’ learning-based approach, the foundation of which is our core Generative Synthesis technology. In essence, we leverage the ‘Inquisitor’ component of our technology to probe a neural network and garner an understanding of its inner workings. Put another way, we obtain a direct and global understanding of the neural network’s decision-making process using AI itself.

The benefits of this approach are twofold. First, as demonstrated in the NeurIPS paper described above, GenSynth Explain more accurately reflects the network decision-making process when compared to existing state-of-the-art techniques including LIME, SHAP, and Expected Gradients. Second, the quantitative insights obtained on the model’s inner workings can be employed to guide and automate better network design.

Making Explainability Real

Explainability has been more of an academic concept than a concrete technology. With Generative Synthesis, we made explainability tangible and at the heart of how we deliver faster, more accurate, and more transparent deep learning. –Dr. Alexander Wong, Chief Scientist, DarwinAI

Our goal at DarwinAI is to convert nebulous concepts around explainability into concrete tools that help developers:

  • Design better models, effectively and efficiently
  • Measure bias
  • Meaningfully apply quality assurance to make models more robust

In this spirit, we leveraged our research to integrate three important explainability features into our platform: efficient model design, actionable data recommendations, and actionable analysis.

Efficient Model Design

Our foray into explainability began almost two years ago by way of our toolset for neural network design and performance analysis. As detailed in our inaugural paper, the Generative Synthesis process obtains a deep understanding of a neural network by way of an inquisitor that garners insights about its inner behavior. These insights are leveraged in two important ways:

Performance Explainability

The insights are translated into analytical explanations that provide a detailed breakdown of how a network performs:

Per the figure above, such granular feedback — what we might term ‘model performance’ explainability — allows a developer to improve their network design for efficiency and accuracy.
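The breakdown our platform produces is generated by Generative Synthesis itself; purely as an illustrative analogue (and not GenSynth output), the sketch below shows the kind of per-layer profile that makes this sort of feedback actionable, using a stock ResNet-18 as a stand-in model.

```python
# Illustrative analogue of 'model performance' feedback (not GenSynth output):
# a per-layer parameter profile showing where a network's capacity, and much
# of its cost, is concentrated. ResNet-18 here is just a stand-in model.
from torchvision import models

model = models.resnet18(weights=None).eval()

profile = []
for name, module in model.named_modules():
    if len(list(module.children())) == 0:  # leaf layers only
        n_params = sum(p.numel() for p in module.parameters())
        if n_params:
            profile.append((name, n_params))

total = sum(n for _, n in profile)
# Report the five heaviest layers and their share of total parameters.
for name, n in sorted(profile, key=lambda kv: -kv[1])[:5]:
    print(f"{name:25s} {n:>10,d} params ({100 * n / total:.1f}%)")
```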

Automated Model Design & Generation

The insights obtained through performance explainability can be leveraged to automatically create new and more efficient neural network models that are deployment-ready.

That is, the ‘generator’ component of our platform incorporates these insights into a unique generative process that is not constrained by architecture priors or training priors and thus has unlimited flexibility to devise new network architectures and training strategies. The results can be noteworthy, as illustrated by this Intel case study that showed order-of-magnitude speedups when Darwin-generated networks were run on their chipsets.

The generative process is also powerful in light of a very common practice in deep learning: leveraging public models originally designed for generic tasks and customizing them for specific ones. Quite often this approach, which involves repurposing an off-the-shelf network for a proprietary task and dataset, is painstakingly arduous and can take months to carry out, often producing a poorly performing model to boot. Using Generative Synthesis, however, a customized, performant, and production-ready model can be created in a matter of days.
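For readers less familiar with that workflow, a minimal sketch of the manual ‘repurpose a public model’ practice looks something like the following; the ImageNet-pretrained ResNet-50 and the five-class task are purely illustrative assumptions.

```python
# Minimal sketch of the manual practice described above: take a public,
# ImageNet-pretrained backbone and swap its head for a proprietary task.
# The backbone choice and the 5-class output are illustrative only.
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # hypothetical proprietary task

model = models.resnet50(weights="DEFAULT")

# Freeze the pretrained feature extractor...
for param in model.parameters():
    param.requires_grad = False

# ...and replace the classification head for the new task.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# What typically follows is weeks of data preparation, fine-tuning schedules,
# and architecture tweaks before the model performs acceptably, which is the
# manual effort Generative Synthesis aims to shortcut.
```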

While this granular approach is a good start — and has produced important efficiencies for our clients — the multifaceted challenge of explainability warrants additional tooling.

Actionable Data Recommendation

A tremendous amount of time and effort is spent collecting and annotating the datasets used to train neural networks. Unfortunately for model designers, quality data is often expensive and in short supply.

To address these commercial challenges, the second stage of GenSynth Explain consists of identifying gaps in a dataset and determining which unlabeled data could be labeled to increase model accuracy and robustness. This capability is especially valuable to organizations for two reasons:

  1. They are spending substantial amounts of money on labeled data, with little insight into its effectiveness.
  2. They possess tremendous amounts of unlabeled data (often collected in the field) but have no way of leveraging it or prioritizing their labeling activities.

GenSynth Explain addresses these challenges directly. By delving into the internals of the network, the platform is able to understand how the network views and leverages training data. This insight can be extrapolated into identifying specific types of data to improve performance. From there, a nebulous collection of unlabeled data samples can be ranked in order of importance and surfaced to designers. Such a targeted approach is in contrast to the not uncommon practice of throwing as much data as possible at the problem and hoping for the best.

The results of this undertaking are promising. In early POCs with our clients, we’ve reduced the amount of labeled data needed to effectively train a model by 35 to 94 percent — which translates directly into reduced costs.
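GenSynth Explain’s ranking criterion itself isn’t public, but to make the general workflow of prioritizing unlabeled data concrete, here is a minimal sketch that swaps in a common active-learning heuristic, predictive entropy, as a stand-in score; the model and unlabeled_loader names are assumptions.

```python
# Sketch of the general 'rank the unlabeled pool, label the top of the list'
# workflow. The scoring here is predictive entropy, a common active-learning
# heuristic used purely as a stand-in for GenSynth Explain's own criterion.
import torch
import torch.nn.functional as F

def rank_unlabeled(model, unlabeled_loader, device="cpu"):
    """Return (entropy, batch_index, item_index) triples, most uncertain first."""
    model.eval().to(device)
    scored = []
    with torch.no_grad():
        for batch_idx, images in enumerate(unlabeled_loader):
            probs = F.softmax(model(images.to(device)), dim=1)
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
            for item_idx, h in enumerate(entropy.tolist()):
                scored.append((h, batch_idx, item_idx))
    return sorted(scored, reverse=True)

# Usage (model and unlabeled_loader are assumed to exist):
# to_label = rank_unlabeled(model, unlabeled_loader)[:1000]
```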

Here’s how Drew Gray, CTO of Voyage Auto, characterized the data challenges facing his team and the impact of GenSynth Explain:

“To date, much of the work and available tooling around explainability has been subjective and quite abstract. With this technology, the Darwin team have translated their impressive research into a promising feature-set that is both practical and powerful, resulting in intelligible insights that we can act upon. Their GenSynth Explain technology was able to recommend the data to annotate to improve model performance. Given the effort in preparing data for deep learning–collating, cleansing, annotating–these results are very promising and highlight the way in which explainability can be made real for companies like ours. We look forward to working with them on this important technology.”

Actionable Analysis

Sometimes the prediction of a neural network will differ from that of a human labeler. And, on occasion, this will involve an ambiguous boundary case.

Consider the example below in which a person is walking alongside a bicycle. In this instance, the human labeled the subject as a “Pedestrian” whereas the network classified them as a “Cyclist”.

In such cases, it is useful to identify the salient inputs — or combination of inputs — behind the network’s decision; indeed, a network will often make the right decision for the wrong reasons.

To address such edge cases, our explainability offering allows developers to ‘zoom in’ on these scenarios and analyze the reasoning behind a particular inference (in this case, the highlighted bicycle beside the person leads the model to think they’re a cyclist). In addition, we are working on features to ‘cluster’ such edge cases and describe their commonalities; identifying, for example, why the network consistently misclassifies someone standing next to a bike.
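Our ‘zoom in’ analysis is built on GenSynth’s own insights; as a rough, generic stand-in for readers who want to experiment, the sketch below uses occlusion sensitivity to highlight which regions drive a single prediction. The model, image, and cyclist_idx variables are assumptions.

```python
# Generic stand-in for 'zooming in' on one inference: occlusion sensitivity.
# Slide a blank patch across the image and record how much the predicted
# class's confidence drops; the regions with the largest drop (e.g. the
# bicycle in the example above) are the evidence driving the decision.
import torch
import torch.nn.functional as F

def occlusion_map(model, image, target_class, patch=32, stride=16, fill=0.0):
    """image: [3, H, W] tensor. Returns a coarse sensitivity map."""
    model.eval()
    _, H, W = image.shape
    with torch.no_grad():
        base = F.softmax(model(image.unsqueeze(0)), dim=1)[0, target_class]
        rows = (H - patch) // stride + 1
        cols = (W - patch) // stride + 1
        heat = torch.zeros(rows, cols)
        for i in range(rows):
            for j in range(cols):
                occluded = image.clone()
                occluded[:, i*stride:i*stride+patch, j*stride:j*stride+patch] = fill
                prob = F.softmax(model(occluded.unsqueeze(0)), dim=1)[0, target_class]
                heat[i, j] = base - prob  # large drop => important region
    return heat

# Usage (model, image, and the 'Cyclist' class index are assumed):
# heat = occlusion_map(model, image, target_class=cyclist_idx)
```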

The Road Ahead…

As the proliferation of AI continues — as the machinery is bent and molded in the service of our goals — concern over how a neural network makes its decisions, particularly for life-critical applications, will become increasingly pronounced.

What should be evident from this analysis is that the black box problem is not merely a mild annoyance a developer must endure, but a dangerous dilemma akin to driving a race car with severely impaired vision: without insight into the nature of what they are building, the construction process will remain fragile and painstakingly arduous.

It is in this thorny area that our expertise lies.

Based on years of scholarship from our academic team, we’ve begun constructing eyeglasses for the deep learning community. If 2019 was about surfacing the black box problem in a general way, this year is about devising concrete ways to provide explainability in practical commercial settings.

With GenSynth Explain, we’ve demonstrated that it is possible to more accurately understand how a neural network makes its decisions, unlocking new possibilities for the commercial use of deep learning and allowing developers to build AI they can trust.

As the giants of our initial thought experiment might attest, the potential of new machinery is limited only by our understanding of how it works. For from this understanding flows additional innovation, tangential application, and true impact.

We look forward to illuminating this dark corridor of the deep learning journey and the commercial advancements that will result.

Stay tuned for an exciting year…

If you’d like to learn more about our work, check out our website or contact us. Additionally, if you find yourself in San Francisco the last week of March, be sure to check out MIT Tech’s AI and Trust conference and the AI Bias Interactive Panel I’ll be speaking at.
