To what extent can we trust Algorithmic Decisions?

by Jaspreet Sandhu

At a time when humans are becoming more reliant on AI (speech recognition on our mobile phones, real-time speech translation, computer vision for self-driving cars), trust and transparency have become key issues to address as these systems are deployed on a global scale. Their black-box nature means that we can’t entirely trust the validity of their outputs, especially since we cannot see how they arrived at their decisions.

Among the academic experts working on this issue is Adrian Weller, Programme Director for Artificial Intelligence at the Alan Turing Institute, where he is also a Turing Fellow leading a group on Fairness, Transparency and Privacy. He is also a senior research fellow at the Leverhulme Centre for the Future of Intelligence, where he leads work on Trust and Transparency. The article below discusses the issues he brought up during a talk on Trust and Transparency at the Data Science Summer School at Ecole Polytechnique, Paris.

Trusting algorithmic decisions is a topic relevant not only to data scientists, but also to the public, who will increasingly find themselves subject to those decisions, and to the managers of teams designing these systems. In this article, I will discuss the major points Adrian made as he laid out the issue of trust around black-box algorithmic decision making and the role that transparency can play in making these systems more reliable.

Our key: Valid and interpretable outputs
If humans are going to use AI systems, we need to be able to trust that their outputs are valid and interpretable: they also need to be reliable and work for our benefit. There are currently some problems with this, for instance:
  • Recommendations from online retailers: the system is incentivised to learn to appeal to our immediate instincts rather than what’s best for us/society in the long run.
  • Algorithmic stock-market trading: this provides a liquid market which works very well but, when the algorithms are left to run their course, also gives us Flash Crashes (e.g. May 2010).

Another blow to reliability is the ease with which AI systems can be fooled by “adversarial examples”. For example, a neural network (NN) might correctly identify a picture of a panda, but when a small, carefully chosen perturbation that looks like noise is added to the picture, the NN identifies it as a gibbon (Open AI Blog, 2017).

Figure 1: An adversarial example applied to GoogLeNet. Adding an imperceptibly small and carefully selected vector to the initial image changes GoogLeNet’s predicted outcome to ‘Gibbon’ with very high certainty
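
To make the idea of a “carefully selected vector” concrete, here is a minimal sketch of one common way to craft such a perturbation, the fast gradient sign method. It is an illustration under stated assumptions, not the setup used for the figure: the classifier is a toy logistic regression rather than GoogLeNet, and the weights, the 100 “pixel” values and epsilon are all hypothetical.

```python
import math

def predict_proba(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)

def fgsm_perturb(weights, bias, x, true_label, epsilon):
    """Move each feature by at most epsilon in the direction that raises the loss.

    For logistic regression with cross-entropy loss, the gradient of the loss
    with respect to the input is (p - y) * w.
    """
    p = predict_proba(weights, bias, x)
    gradient = [(p - true_label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]

# Hypothetical model and input: 100 "pixels" on a 0-to-1 scale.
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]
bias = 0.0
x = [0.52 if i % 2 == 0 else 0.48 for i in range(100)]

x_adv = fgsm_perturb(weights, bias, x, true_label=1, epsilon=0.05)

p_original = predict_proba(weights, bias, x)         # above 0.5: class 1
p_adversarial = predict_proba(weights, bias, x_adv)  # below 0.5: class 0
largest_change = max(abs(a - b) for a, b in zip(x_adv, x))
print(p_original, p_adversarial, largest_change)
```

No single input value moves by more than epsilon, yet the many small changes all push in the same direction, which is why the predicted class flips.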

More worryingly, in connection with self-driving cars, an NN trained to recognize traffic signs can be fooled by a few simple pieces of black and white tape on a sign: in the example below, the NN identified a stop sign as meaning “Speed limit 45”, so a self-driving car relying on it would have just continued driving (Kevin Eykholt, 2018).

Figure 2: Putting strips of black tape on a stop sign changed the output of the model from “Stop sign” to “Speed limit 45”

As a possible solution to this, Adrian suggested taking a probabilistic approach and modelling uncertainty: if an AI system can recognise when its confidence in a prediction falls below a minimum threshold (“known unknowns”), it could switch to a pre-programmed safe fall-back mechanism, e.g. coming to a halt.
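
The fall-back idea can be sketched in a few lines. This is only an illustration: the labels, the 0.9 threshold and the "SAFE_STOP" action are hypothetical, and softmax confidence is used here as a crude stand-in for a properly modelled uncertainty. An adversarial input can still be classified with high confidence, so a real system would need a better-calibrated estimate.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def decide(logits, labels, min_confidence=0.9, fallback="SAFE_STOP"):
    """Return the predicted label, or the fallback if confidence is too low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return fallback, probs[best]
    return labels[best], probs[best]

# Hypothetical labels and classifier scores.
labels = ["stop sign", "speed limit 45", "yield"]

confident_action, confident_p = decide([8.0, 1.0, 0.5], labels)
uncertain_action, uncertain_p = decide([1.2, 1.0, 0.9], labels)
print(confident_action, uncertain_action)
```

The design choice is that the system never acts on a prediction it is unsure about: below the threshold, the label is discarded entirely and the safe action is taken instead.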

Making AI systems more transparent and reliable
AI systems are often uninterpretable black boxes — we don’t know how they get their results. Can we make them more transparent?

There are different types of transparency, depending on the audience. For a developer, transparency helps to understand when a system is likely to work well or badly, and how to improve it.

For users, transparency helps to understand how a decision was reached and which variables contributed to it, and it enables them to challenge the decision if they deem it unfair or unfounded, for example when an AI system turns down a loan application or is used in criminal sentencing.

For an expert, such as a regulator, transparency helps to understand what happened when something has gone wrong, and to assign liability.

One way of making AI systems more transparent is to devise “explanation models” to explain existing models. Adrian showed the picture below: in this example, a NN was presented with a picture of a husky, which it incorrectly classified as a wolf (Marco Tulio Ribeiro, 2016).

Figure 3: Example of a bad algorithm which misclassifies a husky as a wolf due to the snow in the background

The explanation model tells us that the classifier arrived at its decision based on the presence of snow in the background. This is a symptom of bias in the training data: only the pictures of wolves had snow in the background, so the model learned to associate snow with wolves, resulting in a biased and inaccurate algorithm. This is obviously useful to know, since it allows us to adjust the training data to iron out that kind of spurious association.
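
The sketch below shows the general idea behind such an explanation: switch parts of the input on and off, query the black box each time, and see which part moves the output most. It is a deliberately simplified perturbation-based attribution, not the method from the cited paper, which fits a local interpretable model on perturbed samples. The feature names and the stand-in classifier are hypothetical.

```python
from itertools import product

# Hypothetical, human-readable parts of the image.
FEATURES = ["snow_background", "pointed_ears", "grey_fur"]

def black_box_wolf_score(present):
    """Hypothetical stand-in for an opaque classifier's "wolf" score.

    `present` holds one 0/1 flag per feature. The explainer below only
    calls this function; it never looks inside it.
    """
    snow, ears, fur = present
    return 0.1 + 0.8 * snow + 0.05 * ears + 0.0 * fur

def explain(model, n_features):
    """Estimate each feature's effect by toggling it on and off.

    The effect of feature j is the mean model score over all inputs where
    j is present minus the mean score over all inputs where it is absent.
    """
    masks = list(product([0, 1], repeat=n_features))
    scores = [model(mask) for mask in masks]
    effects = []
    for j in range(n_features):
        on = [s for m, s in zip(masks, scores) if m[j] == 1]
        off = [s for m, s in zip(masks, scores) if m[j] == 0]
        effects.append(sum(on) / len(on) - sum(off) / len(off))
    return effects

effects = explain(black_box_wolf_score, len(FEATURES))
ranking = sorted(zip(FEATURES, effects), key=lambda pair: -abs(pair[1]))
for name, effect in ranking:
    print(f"{name}: {effect:+.2f}")
```

Here the snow feature dominates the ranking, which is the kind of signal that would prompt a developer to inspect the training data.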

Beware of superficial transparency

However, there still remains the risk of superficial transparency: humans have a tendency to be satisfied by any explanation, even if it’s misleading. A stunning example of this is the “Copy Machine Study”, conducted in 1978 by psychologist Ellen Langer at Harvard University (Langer, 1978). In that experiment, a researcher would spot someone waiting in a queue at a library photocopy machine and walk over with the intention of cutting in front of that person. Then, the researcher would look at the innocent bystander and ask them one of three questions.

  • Version 1 (request only): “Excuse me, I have 5 pages. May I use the Xerox machine?”
  • Version 2 (request with a real reason): “Excuse me, I have 5 pages. May I use the Xerox machine, because I’m in a rush?”
  • Version 3 (request with a fake reason): “Excuse me, I have 5 pages. May I use the Xerox machine, because I have to make copies?”

Even though Version 3 didn’t make sense — “because I have to make copies” was not a good reason for jumping the queue because everyone in the queue needed to make copies — it performed as well as a genuine reason. When the researchers analyzed the data, they found these results:

Results of the Copy Machine Study (Langer, 1978)

The experiment suggests that while humans are more likely to trust and allow a request when an explanation is given, a nonsensical explanation seems to work as well as a valid one. In the AI context, this tendency could be abused by malicious agents, or even by enterprises that could use unrelated reasons to get a user to consent to the use or sale of their private data.

Transparency is not the holy grail!

Keeping in mind the nuances of consent, we must also make sure that transparency does not become an end in itself. Which is more acceptable in a self-driving car: a simpler but more interpretable algorithm that results in 100,000 deaths per year, or a highly complex but more accurate black-box algorithm that results in 1,000 deaths per year?

One class of algorithms where transparency is crucial is those used in criminal sentencing in the US. Given the extent of the human impact of each decision taken by the algorithm, it is important to have results that can be understood, questioned and challenged. Bias is rampant even in commercial AI systems developed by giants in the industry: for example, AI image recognition systems from major technology companies misclassify the gender of darker-skinned women far more often than that of white men (MIT News, 2018), and facial recognition systems have been found to be much less accurate for non-white faces (Wired Inc., 2018).

As algorithms are deployed across different industries, especially those that significantly impact the lives of the people they make decisions about, such as healthcare, credit rating and recidivism prediction, it becomes crucial to ensure that the decisions are fair and made on lawful grounds rather than discriminatory ones such as race or gender. We will discuss the legal interpretation of fair and non-discriminatory algorithms in our next post, which outlines a talk by Mireille Hildebrandt, a professor in ‘Interfacing law and technology’, and covers the new General Data Protection Regulation (GDPR), which attempts to enforce this legal framework. Applicable to all EU residents from 25 May 2018, GDPR defines and regulates policy on data storage, data protection and the subject’s ‘right to explanation’ on automated decisions made about them by algorithms.

Works Cited
Ansaro. (2017, Oct 12). Retrieved from
Kevin Eykholt, I. E. (2018). Robust Physical-World Attacks on Deep Learning Visual Classification. CVPR. 
Langer, E. B. (1978). The Mindlessness of Ostensibly Thoughtful Action: The Role of “Placebic” Information in Interpersonal Interaction. Journal of Personality and Social Psychology, 36(6), 635–642.
Marco Tulio Ribeiro, S. S. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135–1144). ACM.
MIT News. (2018, Feb 11). Retrieved from
Open AI Blog. (2017, Feb 24). Retrieved from
Wired Inc. (2018, March 29). How coders are fighting bias in facial recognition software. Retrieved from