Trust, and don't verify: the AI black box problem

Philippe Hocquet · 4 min read · Oct 1, 2017

The AI black box problem is a popular topic in the news, and a quite unpopular one in the AI community. Just a few headlines:

"Demystifying the Black Box That Is AI", Scientific American, August 2017

"Explainable Artificial Intelligence: Cracking open the black box of AI", Computerworld, April 2017

"The Dark Secret at the Heart of AI: No one really knows how the most advanced algorithms do what they do. That could be a problem", MIT Technology Review, April 2017

"Making computers explain themselves", MIT News, October 2016

"Is Artificial Intelligence Permanently Inscrutable? Despite new biology-like tools, some insist interpretation is impossible", Nautilus, September 2016

A Chief Medical Officer will ask: "How can I trust something you don't understand?" And consumer groups are promoting a right to explanation (an EU regulation is already moving in that direction).

But we are mixing up very different things.

First, the black box problem

It's a serious one, but it is limited to large deep learning models (neural networks).

Neural networks break large computation problems into millions or billions of small pieces, then advance step by step, an architecture famously inspired by our brain. Most AI breakthroughs since 2009 have come from this approach.

As impressive as it is, it is an engineer's solution: throwing massive amounts of data and hardware at the problem.

As Geoffrey Hinton himself observed, the human brain doesn't work that way. We don't derive a simple understanding from a massive amount of data; we derive a massive amount of understanding from very little data. A child who sees an iPhone once will remember it, give it a name, find uses for it, make comparisons, and connect it to their environment.

In short, we use abstractions; deep learning uses calculations. A model can reach a correct conclusion via a path that has nothing to do with what would be, for us, a logical view of the problem. And with 200 million calculations, we have no way to find out what that path was.
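To make that concrete, here is a toy sketch (mine, with invented sizes and weights, nothing from a real system). Even at this scale the "reasoning" is plain arithmetic; scale the same arithmetic up to millions or billions of weights, and the idea of reading off the path to a decision stops making sense.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy network: 4 inputs -> 8 hidden units -> 1 output, with made-up weights.
# Real deep learning models do the same kind of arithmetic with millions or
# billions of weights, which is why nobody can trace the "path" to a decision.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

def predict(x):
    h = np.maximum(0.0, x @ W1 + b1)             # hidden layer (ReLU): arithmetic
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # output score (sigmoid): arithmetic

print(predict(rng.normal(size=4)))  # a score, with no human-readable "because" attached
```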

As some of the best experts say:

“Our results suggest that classifiers based on modern machine learning techniques … are not learning the true underlying concepts that determine the correct output label. Instead, these algorithms have built a Potemkin village.” (I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and Harnessing Adversarial Examples")

But it works, some will object, so what is the problem? After all, we don't understand the inner workings of human reasoning either.

The first obvious problem is high-stakes decisions, such as medical or military applications, which demand explainability and verification. Even though medical errors cause far more deaths (an estimated 251,000 a year in the US) than AI does (zero), it is difficult to deploy systems when we are not quite sure what they do.

A lot of research is underway, spurred by DARPA funding since September 2016. One option is "pseudo-explanation": the system documents its main decision points. But it still cannot say why.
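As an illustration only (the data and the model below are invented, not anything from the DARPA program), post-hoc feature attribution is one technique in that spirit: shuffle each input and measure how much the model's accuracy drops. It documents what the model relied on; it never answers why.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

# Synthetic data and an opaque-ish model, purely for illustration.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=4,
                           random_state=1)
model = GradientBoostingClassifier(random_state=1).fit(X, y)

# Permutation importance: documents *what* the model leaned on,
# but says nothing about *why* it leaned on it.
attrib = permutation_importance(model, X, y, n_repeats=10, random_state=1)
for i in attrib.importances_mean.argsort()[::-1][:3]:
    print(f"feature {i}: importance {attrib.importances_mean[i]:.3f}")
```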

But even for simple applications, the black box problem is a limitation. Data scientists make constant trade-offs between predictive power and explainability. A simple model helps you understand what is going on but makes crude predictions; a complex model makes good predictions but is difficult to analyze. Applications need a good trade-off: not only catching the online fraudsters but understanding their latest tactics in order to update the system. A deep learning model that leaves us in the dark is not a solution.
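A rough sketch of that trade-off (synthetic data and invented numbers, nothing from a real fraud system): a logistic regression you can read line by line, next to a random forest you mostly cannot.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The readable model: one signed coefficient per feature, crude but explainable.
simple = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# The flexible model: usually more accurate, much harder to interrogate.
flexible = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print("logistic regression accuracy:", round(simple.score(X_te, y_te), 3))
print("random forest accuracy:      ", round(flexible.score(X_te, y_te), 3))
print("readable coefficients:", np.round(simple.coef_[0][:5], 2))
```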

Next, the trust problem

I have read many articles about the black box problem. Each mentions deep learning, then gives examples that have nothing to do with it.

They are still onto something. Even "classical" machine learning can become unstable on high-dimensional problems. A marketing model may track clicks across 750,000 websites. For the algorithm, that is a 750,000-dimensional space, as in 2D or 3D. Try to picture a 750,000-dimensional graph; you can't, and sometimes the maths can't cope either. Researchers have developed hundreds of clever tricks to help, but not all of them are robust.
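A small experiment of my own (not from the article) shows why intuition gives out: for random points, the gap between the closest and the farthest pair nearly vanishes as the number of dimensions grows, which is one reason distance-based tricks can stop being robust.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Pairwise distances between 30 random points as the dimension grows.
# In 2D or 3D the farthest pair is many times farther than the closest one;
# in very high dimensions all distances start to look almost the same.
for d in (2, 3, 1_000, 100_000):   # the 750,000-D case only gets worse
    pts = rng.random((30, d))
    dists = pdist(pts)             # all unique pairwise Euclidean distances
    print(f"d = {d:>7}: farthest/closest ratio = {dists.max() / dists.min():.2f}")
```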

All of that fuels human skepticism. Many don't believe a data-driven system can capture human realities. Very often the issue is not only the maths; it is a breakdown in the (human) design work connecting the AI model, the available data, and the understanding of the situation. For the CEO or the doctor, the result is the same: the system cannot be trusted.

It’s not really the maths

In practice, the true "black box" issue is rare outside the kind of large projects run at Google, Facebook, or Amazon. The real question is explanation and uncertainty: trust.

AI is based on statistical thinking, the art of uncertainty, and some tend to forget it. Data scientists are good at pointing out statistical fallacies in the news, and less good at admitting they cannot be sure of most of what they do.

A team can be confident the model is robust and still struggle to get its message across. Good storytellers can craft a message, but arguing over p-values will not do much good.

A CDO once told me his title really stood for Chief Diplomatic Officer. People may not trust the machine because they cannot see how it could have a real understanding of the world, and they don't want to hear that their own thinking is biased, or simply wrong. Addressing trust problems takes time spent building relationships, and "showing" more than "telling".

Examples abound where scientists' dismissal of public concerns was the wrong strategy. Many AI teams will need to listen, explain themselves, and build trust.


Philippe Hocquet

Tech executive and start-up cofounder, based in Pittsburgh, PA. Studied law in Paris, business in Lausanne, and data science in Boston.