What Do We Want From Explainable AI?

Aug 11, 2017

How can we explain why machine learning systems make the predictions that they do?

Before we can answer this question of explainable AI, one that Will Knight recently described as “The Dark Secret at the Heart of AI,” we need to take a long, hard look at what exactly we mean by ‘explaining’ things.

Expert systems vs modern machine learning

If we look back at the expert systems of the 1980s, we had what we would consider complete explainability: an inference engine used a knowledge base to make assertions, and it could explain each assertion using the chain of reasoning that led to it.
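To make that idea concrete, here is a minimal sketch of a forward-chaining inference engine that records which rule produced each assertion, so it can replay its chain of reasoning on request. The rules, fact names, and function names below are hypothetical and chosen only for illustration; real expert systems of the era were far richer than this.

```python
# Hypothetical knowledge base: each rule maps a set of premises to one conclusion.
RULES = [
    ({"income_verified", "low_debt"}, "good_credit"),
    ({"good_credit"}, "approve_loan"),
]


def infer(facts, rules):
    """Forward-chain over the rules, recording which premises produced each new fact."""
    known = set(facts)
    derivations = {}  # conclusion -> premises of the rule that fired
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                derivations[conclusion] = premises
                changed = True
    return known, derivations


def explain(fact, derivations):
    """Return the chain of reasoning behind a fact as a list of readable steps."""
    if fact not in derivations:
        return [f"{fact} (given)"]
    steps = []
    for premise in sorted(derivations[fact]):
        steps.extend(explain(premise, derivations))
    reasons = " and ".join(sorted(derivations[fact]))
    steps.append(f"{fact} (because {reasons})")
    return steps


known, derivations = infer({"income_verified", "low_debt"}, RULES)
explanation = explain("approve_loan", derivations)
print("\n".join(explanation))
```

The explanation here is nothing more than the trace of rules that fired, which is why this style of system can always answer “why?” but only knows what its authors wrote down.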

These systems were built entirely on subject matter expertise and, while powerful, were somewhat inflexible. Expert systems were largely an artificial intelligence endeavor and not a machine learning endeavor.

Modern machine learning algorithms go to the opposite end of the spectrum, yielding systems capable of working purely from observations and creating their own representations of the world on which to base their predictions. But on their own they have no ability to explain themselves, or to present those representations in a meaningful way to a human who asks, “why?”

Why are we asking “why?”

Our current techniques are quite powerful, but they do not inspire much confidence in the systems they produce. We’re left wondering: “How do we verify behavior in novel situations?” “How do we iteratively debug and refine these systems?” “How do we filter out systemic, undesired biases that may be present in the data?”

If you are building a system to approve loans, for example, and the machine learning system learns from the data that zip codes are a good predictor of creditworthiness, are you running afoul of non-discrimination rules? What if you can’t even say what factors are ultimately used?

These are non-trivial, real-world concerns as we work to apply AI to our businesses. When my cofounder and I were first starting Bonsai, we spoke to a lot of large companies to learn about the problems they faced in using AI. One company, staffed by quite capable AI practitioners, had built a machine learning system to replace the hand-crafted system used to predict which item to show its users next. The new ML system, in measured tests, outperformed the existing system, but they ultimately decided not to adopt it. The reason was that, despite the improved performance, they could not explain why it made its suggestions, and consequently they were hard-pressed to iteratively improve it. It presented a potential ceiling for what they could do, and so despite their successful work, it was not put into production.

These issues become even more pronounced as we look at systems making health recommendations, autonomously controlling vehicles, and providing operational decision support. Analysts don’t want to simply be told that this is what will optimize their supply chain; they want an explanation of why the AI made these suggestions.

What do we want the machines to explain to us?

Many of us here are familiar with the constant refrain of “why” from children delving deeper and deeper into something. For most situations, when we ask why, we don’t want an explanation in terms of the underlying particle physics… we want an answer at the appropriate level of abstraction. (Though here’s a pretty amusing story showing what happens when a young child continuously asks “Why?” of her chemistry professor father.)

This can be a challenge in the real world because we have to distinguish between introspection and justification, and we must frame things in terms of shared, mutually held concepts that build upon each other. This is why children keep asking why… they don’t yet have all of those concepts.

If my son asks me why he must eat his vegetables and I say “because they are healthy foods that help you grow,” and he asks why again, I can start to explain the nuances of nutrition or I can just say “because I said so”… the former is introspection and the latter is justification.

Justification is more common than one might imagine. For much of our own behavior, we cannot point to a rational chain of deduction for a particular outcome, but we can seek plausible justifications for it. This is so natural that we do it without even thinking about it. But it has large implications for the way we work toward programming explanations into our machine learning systems.

Do we want the systems we build to be able to explain the features they’ve learned and how they were applied? Do we want the system to justify why a prediction was reasonable? Do we want both? How can we work to achieve these objectives?

There is quite a bit of research being done to answer these questions, with techniques generally falling under three categories:

  • Deep Explanation: Trying to tease out what a neural network is doing through introspection or justification
  • Model Induction: Observing the behavior of a trained system and using it to infer a simpler model that can explain that behavior
  • Machine Teaching and Recomposability: A method we use at Bonsai, combining machine learning with subject matter expertise to teach intelligent systems using conceptual hierarchies
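As a rough illustration of the second category, model induction, the sketch below treats a model as an opaque function, queries it on sample inputs, and fits the simplest possible interpretable surrogate (a single “feature > threshold” rule) to its answers. Everything here is an assumption made for illustration: `black_box` is a hypothetical stand-in for a trained model, the feature names are invented, and a one-rule surrogate is far cruder than what research systems use.

```python
import random


def black_box(income, debt):
    """Hypothetical stand-in for an opaque trained model (1 = approve, 0 = deny)."""
    return 1 if income - 0.2 * debt > 50 else 0


def fit_stump(samples, labels):
    """Find the single 'feature > threshold' rule that best mimics the labels."""
    best = (0.0, None, None)  # (agreement, feature index, threshold)
    for feature in range(len(samples[0])):
        for threshold in sorted({s[feature] for s in samples}):
            predictions = [1 if s[feature] > threshold else 0 for s in samples]
            matches = sum(p == y for p, y in zip(predictions, labels))
            agreement = matches / len(labels)
            if agreement > best[0]:
                best = (agreement, feature, threshold)
    return best


# Probe the black box with random queries and record only its outputs.
random.seed(0)
samples = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(200)]
labels = [black_box(income, debt) for income, debt in samples]

agreement, feature, threshold = fit_stump(samples, labels)
names = ["income", "debt"]  # hypothetical feature names
print(f"approve if {names[feature]} > {threshold:.1f} "
      f"(agrees with the black box on {agreement:.0%} of queries)")
```

The surrogate never looks inside the model; it only summarizes observed behavior, so its rule is an approximation whose agreement rate should always be reported alongside it.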

If you’d like to learn more about these techniques, you can view a recent talk (link below) I gave at the O’Reilly AI conference in San Francisco. I’ll also be publishing posts describing each method in detail over the next few weeks.

To learn more about the work we do at Bonsai, you can visit https://bons.ai/.