

Peeking in the Black Box — A Design Perspective on Comprehensible AI — Part 1

A desk with various items and a black box labeled “AI”
AI as a black-box © 2021 Henner Hinze

“Look out, robots, because we’re brave, we are hungry for action, and we’re strapped in for success. And we have no idea what we’re doing.” – The Mitchells vs. the Machines (Sony Pictures Animation, 2021)

AiX Design © 2021 Henner Hinze

In recent years, the predictive accuracy of Artificial Intelligence (AI) technologies has increased tremendously thanks to powerful algorithms such as neural networks with millions of automatically learned parameters. However, this has come at a cost: compared with “classical” approaches like rule-based systems and linear regression, these newer approaches are, due to their inherent complexity, significantly less transparent and harder to interpret. Hence, we consider them “black box” systems. This lack of explainability limits their usefulness in many practical applications. The problem is recognized but in many respects still unsolved, and it is investigated under the term eXplainable AI, or XAI.

Limiting the discussion around XAI to a technological exercise, or to the sole purposes of legal compliance and trust-building, would miss its larger potential. I believe that when XAI is considered an integral part of AiX design, it presents an opportunity to set AI on a path to become not only explainable but informative, even educational, to its users.



Why bother?

Let us imagine we work in a forensic lab, much like the ones seen in TV procedurals like “CSI”. The police have provided us with a blurry image from a surveillance camera showing a perpetrator’s face.

Low-resolution image of a person’s face
Source: Wikipedia (down-scaled by the author)

Our task is now to improve the quality of the image so that it can be matched against the police database. The infamous fictional tool for the job is called ‘Zoom-and-Enhance’. In principle, it should not be possible to upscale the image this way, as this would require creating information that is not present in the original material. But modern machine learning techniques allow us to tackle this problem anyway. Using Face Depixelizer, we have upscaled the pixelated image to obtain a high-resolution image that looks very plausible. Job well done!

Low-resolution image on the left, high-resolution image on the right reconstructed with Face Depixelizer
High-resolution image created with Face Depixelizer by the author

But hold on! You might already have spotted that the original image is in fact of former U.S. President Barack Obama. So, what is happening here? The model used for the transformation has been trained on pairs of pixelated images and their corresponding high-resolution versions. The AI does not truly scale the original pixelated image. It reconstructs a new image from a combination of the high-resolution images it has seen during training whose pixelated counterparts are most similar to ours. This is a useful tool for artistic purposes but utterly inappropriate for the forensic use case. Without understanding the implications of the underlying mechanism, we would have gone and prosecuted some random innocent person.
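
The mechanism described above can be made concrete with a deliberately tiny sketch. It is not how Face Depixelizer is implemented; it is a simplified stand-in that retrieves a single best-matching training image instead of combining several. The 4x4 “faces” and the names `downscale`, `distance` and `reconstruct` are hypothetical and exist only for illustration.

```python
# Toy illustration: "upscaling" by retrieval, not by recovering lost detail.
# All images and function names are made up for this sketch.

def downscale(image, factor=2):
    """Average non-overlapping factor x factor blocks of a square grayscale image."""
    size = len(image) // factor
    return [
        [
            sum(image[r * factor + dr][c * factor + dc]
                for dr in range(factor) for dc in range(factor)) / factor ** 2
            for c in range(size)
        ]
        for r in range(size)
    ]

def distance(a, b):
    """Sum of squared pixel differences between two equally sized images."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def reconstruct(low_res, training_images, factor=2):
    """Return the training image whose downscaled version best matches low_res."""
    return min(training_images,
               key=lambda img: distance(downscale(img, factor), low_res))

# The person actually in front of the camera (never seen during training).
true_face = [[4, 4, 0, 0],
             [4, 4, 0, 0],
             [0, 0, 8, 0],
             [0, 0, 0, 8]]

# Two faces the model has seen during training.
seen_face = [[0, 8, 0, 0],
             [8, 0, 0, 0],
             [0, 0, 4, 4],
             [0, 0, 4, 4]]
other_face = [[9, 9, 9, 9] for _ in range(4)]

observed = downscale(true_face)                      # the blurry camera image
result = reconstruct(observed, [seen_face, other_face])

print(result == seen_face)   # a sharp, plausible face from the training set
print(result == true_face)   # but not the person who was photographed
```

The reconstruction is perfectly consistent with the blurry input, because `seen_face` and `true_face` downscale to the same pixels, yet it shows a different person. The detail comes from the training data, not from the photograph.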

This example is fictional, but AI comprehensibility has real-life consequences (see some examples in Weapons of Math Destruction by Cathy O’Neil; the example above is inspired by Boris Müller, 2021).

Proof of performance is not enough

One could make the argument that AI does not need to be comprehensible to be trustworthy if it can be proven to perform accurately. Cassie Kozyrkov, Chief Decision Scientist at Google, makes exactly this argument in ‘Explainable AI won’t deliver. Here’s why.’ (Kozyrkov C, 2018).

I give a satirical outlook on the possible consequences of this stance in my short story ‘Reply Hazy. Try Again.’
But in all seriousness, I believe there are flaws in this argument. While performance is a crucial element in trust-building (Lee & Moray, 1992), it is not the only relevant factor.

Kozyrkov uses an analogy, asking which of two spaceships we would trust: the one that is theoretically sound but has not yet flown (well understood but untested) or the one that has proven to perform safely over years of successful flights (poorly understood but well tested). She prefers the latter. This analogy raises two questions:

  1. On what grounds were spacefarers supposed to trust the second spaceship before it had years of service behind it? This is the situation with every newly introduced AI system.
  2. If the second spaceship has been flying for years, when has it reached the end of its service life and become unsafe to fly? Answering this requires insight into the operation of the machine. Patterns in real-world applications can change, and formerly well-performing AI systems degrade silently.
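
To illustrate what silent degradation can look like, here is a minimal sketch that tracks accuracy over consecutive windows of cases instead of as a single overall number. The data are invented, and the helper names `windowed_accuracy` and `first_degraded_window` as well as the chosen tolerance are assumptions made for this example, not an established monitoring method.

```python
# Hypothetical sketch: compare recent accuracy against a baseline.

def windowed_accuracy(predictions, outcomes, window):
    """Accuracy for each consecutive, non-overlapping window of `window` cases."""
    accuracies = []
    for start in range(0, len(predictions) - window + 1, window):
        pairs = zip(predictions[start:start + window],
                    outcomes[start:start + window])
        accuracies.append(sum(p == o for p, o in pairs) / window)
    return accuracies

def first_degraded_window(accuracies, baseline, tolerance):
    """Index of the first window more than `tolerance` below `baseline`, else None."""
    for index, accuracy in enumerate(accuracies):
        if baseline - accuracy > tolerance:
            return index
    return None

# Invented data: the model keeps predicting 1 while the world slowly changes.
predictions = [1] * 12
outcomes = [1, 1, 1, 1,   1, 1, 1, 0,   1, 0, 0, 0]

accuracies = windowed_accuracy(predictions, outcomes, window=4)
alert_at = first_degraded_window(accuracies, baseline=1.0, tolerance=0.3)

print(accuracies)  # accuracy per window, oldest first
print(alert_at)    # index of the first window that breaches the tolerance
```

Note that such a check only tells us that something has changed. Deciding whether the system has “run its time” still requires understanding why its predictions stopped matching reality.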

There are a few reasons why exclusively relying on the performance of an AI system to trust it may not be enough:

  1. When testing is supposed to be done in the real world, this might, depending on the stakes involved, pose an unacceptable risk.
  2. When testing has been done in a lab, do users understand the significance of the metrics well enough to make informed decisions? If the system’s prediction is wrong, how wrong will it be? It is the users of an AI system that are accountable for decisions made — not the system’s creators.
  3. Measurements from a lab environment might not translate to the real world at all when the algorithm has learned shortcuts based on biases in the training data. This would lead to impressive performance in the lab that is not reproducible in the field.
    Ribeiro et al. (2016) describe an experiment in which a model is trained to distinguish huskies (Eskimo dogs) from wolves with high accuracy. Curiously, the researchers were able to show that the model ignored color, pose, and other attributes of the animal itself and instead based its prediction on the presence of snow in the background. Such a model would barely be usable in practice (see image below).
  4. Patterns a model has learned might not be stable in the real world but shift over time such that predictions gradually worsen.
  5. Applying model predictions in practice can change the environment such that the assumptions on which the model bases its predictions no longer hold.
    Caruana et al. (2015) trained a model to predict the probability of death from pneumonia in order to decide whether to hospitalize patients. On inspection, they found that the model, counter-intuitively, predicted that patients with a history of asthma had a lower risk of dying. This is explained by the fact that doctors typically not only hospitalize those patients but admit them directly to the intensive care unit. The aggressive care administered lowers the risk for pneumonia patients with a history of asthma below that of the general population. Following the model’s prediction without understanding this mechanism would keep patients with preconditions from being hospitalized, subjecting them to an unacceptable risk.
Classification of images as either husky dogs or wolves — Photo sources: Gabe Rebra, Christian Bowen, Monika Stawowy, Simon Rae, amanda panda, Milo Weiler, Robson Hatsukami Morgan, Mariah Krafft (all on Unsplash); Composition and graphics by the author
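
The shortcut problem from point 3 can be reproduced with a toy learner. The following sketch is not the experiment by Ribeiro et al.; it uses a hypothetical one-feature classifier and invented data in which the background happens to separate the classes perfectly in the lab. The feature names `snow` and `wolf_like_build` and the helper `make` are assumptions for illustration only.

```python
# Toy illustration of shortcut learning with made-up data.

def make(snow, wolf_like_build, is_wolf):
    """Build one labeled example as a dictionary of boolean features."""
    return {"snow": snow, "wolf_like_build": wolf_like_build, "is_wolf": is_wolf}

def accuracy(feature, examples):
    """Share of examples where the feature value equals the label."""
    return sum(ex[feature] == ex["is_wolf"] for ex in examples) / len(examples)

def train_stump(examples, feature_names):
    """Pick the single feature that agrees best with the label on the training data."""
    return max(feature_names, key=lambda name: accuracy(name, examples))

# Lab data: every wolf was photographed in snow, no husky was.
lab = (
    [make(True, True, True)] * 3      # wolves in snow, typical build
    + [make(True, False, True)]       # wolf in snow, atypical build
    + [make(False, False, False)] * 3 # huskies on grass, typical build
    + [make(False, True, False)]      # husky on grass, atypical build
)

# Field data: the background no longer lines up with the species.
field = (
    [make(True, False, False)] * 2    # huskies in snow
    + [make(False, True, True)] * 2   # wolves on grass
)

chosen = train_stump(lab, ["snow", "wolf_like_build"])

print(chosen)                   # the feature the learner relies on
print(accuracy(chosen, lab))    # performance measured in the lab
print(accuracy(chosen, field))  # performance on the field data
```

The learner is not “wrong” by its own measure: on the lab data the background really is the better predictor. Only by looking at which feature it relies on can we anticipate that the lab score will not carry over to the field.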

Kozyrkov still clearly makes some valid points in her article, e.g., that there are limits to human comprehension. We invented complex algorithms to solve complex problems, problems too complex to be solved by simple means. Humans are capable neither of visualizing high numbers of dimensions nor of grasping highly non-linear relationships, both of which are characteristic of typical “AI problems”. This means that explanations must necessarily simplify. Kozyrkov points out that while we cannot inspect the workings of every neuron in a human’s brain, we still trust other people. However, humans are not complete black boxes. They consistently produce useful explanations for their ways of thinking and their behavior.

Consider that the models we use for predictions are also only approximations of the complexity of reality. We still find them to be useful. Thus, with few exceptions, we should expect explanations to be useful approximations of the complexity of our AI systems.

Reframing the Accuracy-Comprehensibility Trade-off

There seems to be a consensus that a general trade-off exists between the accuracy of a model and its comprehensibility: the better a model is at predicting, the less understandable it becomes for humans, and vice versa, because both accuracy and opacity grow with complexity. We might conclude that the higher the stakes, the more accurate a prediction we need, and thus that unexplainable predictions will be unavoidable. This seems to be a dilemma, as for high-stakes decisions we also want to deeply understand all the factors playing into them.

We need to clarify and reframe this perspective: The only thing any machine learning algorithm is capable of is making predictions. Yes, even generating a sentence technically means predicting the next word based on the previous ones. A classification is a prediction of what label would be assigned by a human annotator, etc. But even highly accurate predictions are rarely useful by themselves. They need to inform decisions to implement actions. Decisions are based on predictions but apply context and an evaluation of consequences and their probabilities. Otherwise, we would have to assume that any two decision-makers would come to the same decision given the same prediction, which is clearly not true. To help decision-making, AI systems must supply the context of their predictions. Comprehensible AI gives this context.
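
A small sketch may help to separate prediction from decision. It assumes a very simple expected-cost rule, which is only one of many ways to model a decision, and the function name `decide` and all cost figures are invented for illustration.

```python
# Hypothetical sketch: the same prediction, two different decisions.

def decide(probability, cost_of_acting, cost_of_missing):
    """Act when the expected cost of not acting exceeds the cost of acting."""
    expected_cost_of_waiting = probability * cost_of_missing
    return "act" if expected_cost_of_waiting > cost_of_acting else "wait"

predicted_risk = 0.2  # identical model output for both decision-makers

# Decision-maker A: acting is cheap compared with the consequences of a miss.
decision_a = decide(predicted_risk, cost_of_acting=1.0, cost_of_missing=10.0)

# Decision-maker B: acting is expensive in their context.
decision_b = decide(predicted_risk, cost_of_acting=5.0, cost_of_missing=10.0)

print(decision_a)
print(decision_b)
```

The model contributes only `predicted_risk`. Everything else that determines the outcome sits with the decision-maker, which is why a bare number without context is of limited help to them.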

That does not mean we should prefer complete transparency over prediction accuracy in all cases. But starting from an ultimate user goal and its prerequisites will help to make an educated guess about what needs to be explained and how accurate predictions must be. Instead of thinking in binaries such as “black box” vs. “white box” (aka “glass box”), we might want to aim for “grey boxes” (Broniatowski, 2021) that allow for the right balance between comprehensibility and prediction accuracy.

In any case, comprehensibility must not be an afterthought — after all technological decisions have been made — but must be an integral part of product concepts and design in close collaboration with the end-users of an AI system.

In the following articles I will go deeper into the issue of trust in AI and how the inner workings of AI systems can be explained.


Peeking into the Black Box — Trust in AI — Part 2

Henner has a background in design and computer science and loves to think and speculate about AI futures and emergent technologies. He also creates digital products.



  1. Broniatowski D A (2021). ‘Psychological Foundations of Explainability and Interpretability in Artificial Intelligence’, NIST: National Institute of Standards and Technology, U.S. Department of Commerce.
  2. Caruana R, Lou Y, Gehrke J, Koch P, Sturm M, Elhadad N (2015). ‘Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission’, KDD ’15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 1721–1730, Association for Computing Machinery (ACM).
  3. Hinze H (2020), ‘Reply Hazy. Try Again.’, Medium [online], accessible at: (Accessed: 12 August 2021)
  4. Kozyrkov C (2018), ‘Explainable AI won’t deliver. Here’s why.’, Hacker Noon [online], Accessible at: (Accessed: 22 June 2021)
  5. Lee J, Moray N (1992). ‘Trust, control strategies and allocation of function in human-machine systems’, Ergonomics, vol 35, no 10, pp 1243–1270, Taylor & Francis Ltd.
  6. Müller B (2021). ‘Ghost in the Machine: Designing Interfaces for Machine Learning Features’, [Online]. Accessible at:, (last accessed: 27 July 2021).
  7. O’Neil C (2017). ‘Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy’, Penguin Random House.
  8. Ribeiro M T, Singh S, Guestrin C (2016). ‘“Why Should I Trust You?”: Explaining the Predictions of Any Classifier’, KDD ’16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp 1135–1144, Association for Computing Machinery (ACM).

Further reading

  1. van Allen P (2018), ‘Prototyping Ways of Prototyping AI’, Interactions: The HCI Innovator’s Dilemma — Special Topic: Designing AI, vol XXV.6, iss November–December 2018, pp 47–51, ACM.
  2. Churchill E F, van Allen P, Kuniavsky M (2018). ‘Designing AI’, Interactions: The HCI Innovator’s Dilemma — Special Topic: Designing AI, vol XXV.6, iss November–December 2018, pp 35–37, ACM.
  3. Cramer H, Garcia-Gathright J, Springer A, Reddy S (2018). ‘Assessing and Addressing Algorithmic Bias in Practice’, Interactions: The HCI Innovator’s Dilemma — Special Topic: Designing AI, vol XXV.6, iss November–December 2018, pp 59–63, ACM.
  4. Tversky A, Kahneman D (1974). ‘Judgment under Uncertainty: Heuristics and Biases’, Science, vol 185, iss 4157, pp 1124–1131, American Association for the Advancement of Science.
  5. Lindvall M, Molin J, Löwgren J (2018), ‘From Machine Learning to Machine Teaching: The Importance of UX’, Interactions: The HCI Innovator’s Dilemma — Special Topic: Designing AI, vol XXV.6, iss November–December 2018, pp 53–37, ACM.
  6. Martelaro N, Ju W (2018), ‘Cybernetics and the Design of the User Experience of AI Systems’, Interactions: The HCI Innovator’s Dilemma — Special Topic: Designing AI, vol XXV.6, iss November–December 2018, pp 38–41, ACM.
  7. Wong J S (2018), ‘Design and Fiction: Imagining Civic AI’, Interactions: The HCI Innovator’s Dilemma — Special Topic: Designing AI, vol XXV.6, iss November–December 2018, pp 42–45, ACM


