
Bias in AI

KristineDotTech · Published in ArtInUX · Jan 15, 2021 · 11 min read


The paradigm of (un)ethical futures.

Ask anyone what they think artificial intelligence is and you’ll get a different opinion. Some, like Jeff Bezos, will sing its praises for humanity’s digital future, while others, like Elon Musk, will see it as a harbinger of doom. And those who have spent the last 5–10 years under a rock might dismiss it as a fad and yet another buzzword.

The truth is that AI is at a turning point in the field of technology. The presence of AI is ever-increasing, and we may have passed a point of no return, where this technology becomes an expected part of life. Ever imagine your inbox without a spam filter? I didn’t think so.

Image by sub-comics.com

Indeed, there are valid questions stemming from concerns about AI technology. For instance, is Elon right? Will AI pose a threat to the human race? The answer, as of writing this article, is “no”. Experts believe this is unlikely to change for at least another 50 years or so, as AI has yet to rise above human intellect. In her book “Hello World”, Dr Hannah Fry compares the intelligence of modern AI to that of a hedgehog.

Since we are growing ever more dependent on AI systems, we also need to ask a very compelling question: can we trust AI judgment? The answer is: not yet, and certainly not blindly. AI technology suffers from something called bias, which has its foundation in human prejudices introduced through training data.

Bias?

Bias can be defined as “a disproportionate weight in favour of or against an idea or thing. In science and engineering, a bias is a systematic error.” Whether we like to admit it or not, we all have biases. It is a lack of awareness (or willful ignorance?) of these biases that often introduces issues when it comes to designing AI solutions and selecting training data.

“Yes, but that’s human. AI is not human,” I hear you proclaim. That is true. To understand how this affects the models, let’s briefly unpack how human bias creeps into your otherwise friendly AI. To illustrate this at a high level, I drew a comic.

comic by me

As I hope I have been able to illustrate, bias in AI is the covert prejudice in the data used to train algorithms (an algorithm being a set of instructions formulated so that a task can be performed). This can result in discrimination and other harmful social consequences when an AI tool is used. A simple example demonstrates it. Suppose an algorithm has to be created to decide whether an applicant gets accepted into a university or not, and imagine that one of the inputs is location. Now let’s speculate that location corresponds to ethnicity. In that case, the algorithm would favour, even if indirectly, certain ethnic groups over others.
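
To make the proxy effect concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the postcodes, the group labels and the admission rule are hypothetical, not a real admissions model. The point is that ethnicity is never given to the algorithm, yet acceptance rates still split along ethnic lines because location stands in for it.

```python
# Hypothetical illustration of proxy bias: ethnicity is never an input,
# but a correlated feature (location) carries it into the decision anyway.
import random

random.seed(42)

def make_applicant():
    # Assume ethnicity correlates with location: group A mostly lives in
    # postcodes 0-4, group B mostly in postcodes 5-9.
    ethnicity = random.choice(["A", "B"])
    if ethnicity == "A":
        location = random.choices(range(10), weights=[3] * 5 + [1] * 5)[0]
    else:
        location = random.choices(range(10), weights=[1] * 5 + [3] * 5)[0]
    return ethnicity, location

def admit(location):
    # The "algorithm" only looks at location, e.g. because historical
    # intakes came mostly from postcodes 0-4. Ethnicity is never used.
    return location < 5

applicants = [make_applicant() for _ in range(10_000)]
for group in ("A", "B"):
    locations = [loc for eth, loc in applicants if eth == group]
    rate = sum(admit(loc) for loc in locations) / len(locations)
    print(f"Acceptance rate for group {group}: {rate:.2%}")
# Group A ends up accepted far more often than group B, purely via the proxy.
```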

Now. Back to humans and bias.

Cognitive Bias

Cognitive bias is an area of the human psyche that has been extensively studied to explain the prejudicial nature of human behaviour. A cognitive bias is a lack of objective thinking, and for a human brain, it’s a shortcut in information processing. This is because the human brain tends to perceive and process information based on personal experiences and preferences. The information received goes through a kind of filtration, and only that which is familiar or personally relevant gets processed.

There are many types of cognitive bias, such as the “bandwagon effect”, selective perception, priming and confirmation bias, all of which play a role in bias in AI.

  • The bandwagon effect: how the brain concludes that something or someone is valuable because others (or a majority) value it.
  • Confirmation bias: the brain’s tendency to favour new information that matches or confirms existing notions.

An example of confirmation bias: you disagree with someone, google your side of the argument, and take the result that confirms what you already think as proof that you are right, while disregarding all the other results that might prove you wrong.

There is a lot to be said about social media sites and giants like google.com reinforcing people’s confirmation bias through their algorithms, but more on that later. If we acknowledge and understand that this exists and is a large part of the way we humans think, we can start to get a grip on AI bias.

Data analysis uses objective tools for machine learning and for making purely data-driven decisions. Nonetheless, it is humans who select the data to be analysed, and this is where bias creeps into the final product.

These biases are unintentional (one would hope!) and can be called “human error” in some cases. Regardless, the presence of bias in machine learning systems can be quite significant, depending on how such systems are used. Biases can lead to lower quality of customer service, reduced revenue and sales, illegal actions, and dangerous conditions. In industry and commerce especially, such scenarios should be prevented. Organisations need to keep a close eye on the data used to teach algorithms, checking it for cognitive bias, and to validate their models rigorously. Only in this way can bias in AI be limited.

The efficacy of data scientists, analysts and designers

Artificial intelligence can only be rid of inherent bias if data scientists act in an efficacious, ethical and objective way, staying alert to possible bias. Data should represent the range of races, backgrounds, genders and cultures that could be adversely affected by prejudicial issues. Data scientists, analysts and, in some rare but increasingly popular cases, UX designers all play a part in developing these models. It is the responsibility of this team to shape data samples in a way that deliberately reduces the bias that produces faulty machine learning models. Decision-makers must evaluate the appropriateness of applying machine learning technology in all aspects of AI before it goes into production. Have you (in)validated your algorithm with a large enough dataset to be absolutely sure it is bias-free?

The danger of bias in AI

Any discrimination is derogatory and undermines equal opportunity, and its often unintended outcomes can lead to oppression. This has been proven many times in real life, throughout history, and, as you will see in some examples I have collected, continues to be the case in everyday life. Wiping out this prejudice in humanity has been a Herculean task, let alone in machine learning. In AI the task ought to be simpler, but it means eliminating the prejudice at its root: the point at which algorithms and models are written. Anomalies that result in bias occur when algorithms produce results that arise out of faulty assumptions made in the process of data analytics and machine learning.

Examples of bias in AI

Making non-biased algorithms is hard. Here are a few real-life examples of bias:

Bias in American healthcare

In 2019, an algorithm applied to more than 200 million Americans in US hospitals was used to predict which patients would need specially trained nursing staff and extra medical attention. The algorithm did not use racial factors directly, but historical healthcare spending was used as a proxy for medical need. The rationale was that cost indicated how much healthcare a person required: the more someone spent, the more care they needed. Since, on average, white patients incurred more costs, it was concluded they needed more care. Given the known income inequality affecting minority groups in America, this data was already biased towards the wealthier groups. In addition, the dataset used to train the algorithm was not balanced: white patients’ records outnumbered those of black patients by a ratio of 7:1. The result was clear-cut healthcare discrimination.

Bias in America’s criminal justice system

Image: ProPublica

Data in the criminal justice system has been gathered and used for decision-making for nearly 100 years, so it is no surprise that AI would be used in this sphere to assist judges in making consistent and just sentencing decisions. A criminal risk assessment algorithm was used to predict the likelihood of a defendant becoming a recidivist. Analysis of the model’s output showed roughly twice as many false positives for recidivism among black offenders and roughly twice as many false negatives among white offenders. In other words, a black defendant who would not go on to re-offend was about twice as likely as a white one to be flagged as high risk, a prediction likely to influence the judge’s decision. The flip side of the coin is that white offenders who did go on to re-offend were about twice as likely as their black counterparts to be marked as ‘low risk’ and receive lighter sentences.

These algorithms use statistics to find patterns and correlations in data. But correlation is not causation. When the algorithm is trained on historical data, especially if this data is loaded with stereotypical bias, what gets designed is a stereotype reinforcement machine that replicates the errors of the past.
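
For readers who like to see the mechanics, here is a minimal sketch of the kind of per-group error-rate audit that surfaces this sort of disparity. The field names and the toy records are my own assumptions for illustration, not ProPublica’s actual code or data.

```python
# Hypothetical audit: compare false positive and false negative rates
# of a risk model across demographic groups.
def error_rates(records):
    """records: iterable of dicts with 'group', 'predicted_high_risk' (bool)
    and 'reoffended' (bool)."""
    rates = {}
    for group in {r["group"] for r in records}:
        rows = [r for r in records if r["group"] == group]
        negatives = [r for r in rows if not r["reoffended"]]   # did not re-offend
        positives = [r for r in rows if r["reoffended"]]        # did re-offend
        fpr = sum(r["predicted_high_risk"] for r in negatives) / max(len(negatives), 1)
        fnr = sum(not r["predicted_high_risk"] for r in positives) / max(len(positives), 1)
        rates[group] = {"false_positive_rate": fpr, "false_negative_rate": fnr}
    return rates

# Toy data only; in a real audit a large gap between groups is the red flag.
toy = [
    {"group": "black", "predicted_high_risk": True,  "reoffended": False},
    {"group": "black", "predicted_high_risk": True,  "reoffended": True},
    {"group": "white", "predicted_high_risk": False, "reoffended": True},
    {"group": "white", "predicted_high_risk": False, "reoffended": False},
]
print(error_rates(toy))
```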

Gender bias in recruitment

Image: seattletimes.com

In 2016 LinkedIn got hot under the collar as it was discovered to be suggesting male variations of female names in search queries, but not vice versa. For example, “Stephanie” would bring up a prompt asking if users meant “Stephen”, but queries for “Stephen” did not suggest “Stephanie”. The company said this was the result of an analysis of users’ interactions with the site.

Bias in biometric data sets

Image by Joz Wang

Rather embarrassingly for Nikon, back in 2010 its cameras’ image-recognition algorithms, biased towards (read: trained mostly on) Caucasian faces, consistently asked Asian users whether someone had blinked. Nikon was not the only giant to find itself red-faced. A few years later, in 2015, Google received complaints about the Google Photos image-recognition algorithm when users with darker skin tones were being labelled as gorillas.

Bias in artistic expressions

AI-based art generation powers a number of apps that let you repaint pictures in the styles of the great masters or reproduce them as wood engravings. In an obvious example of bias, researchers found flaws in an app called “Abacus”: young men with long hairstyles were mistaken for women in paintings by Raphael and Piero di Cosimo. The researchers who called out the bias believed the output was influenced by the data analysts’ preferences and their use of Renaissance paintings of primarily white, female subjects as training data. (And if you want to explore more on AI in art, we’ve covered it here.)

Bias and Social Media Platforms

There are several examples of AI bias in today’s social media platforms. Data from these platforms is used to train machine learning systems, so biases in that data carry through into the models’ prejudice.

In 2019 Facebook was found to be permitting advertisers to intentionally target ads according to race, religion and gender. For example, women were shown employment ads for nursing or secretarial roles, while men, particularly men from minority backgrounds, were steered towards job ads for janitors and taxi drivers. That is bias on more than one level. Since this was brought to light, Facebook no longer permits advertisers to target employment ads by gender, age or race.

Bias Fixes

I won’t offer you a silver bullet. If I had one, I’d probably fix this problem myself. But I would like to share some ideas that emerged from my research, some of which are more obvious than others.

  • Ensure you follow the 4 V’s of data. Volume & Variety: make sure you have a large enough sample size that brings with it enough variety, from a variety of sources and in a variety of formats. Volume and variety ensure you represent your population as close to the real picture as you are able. Velocity & Veracity: these refer to data availability and data quality. Velocity is a measure of how fast the data is coming in, whereas veracity represents data accuracy and precision, telling you to what degree it can be trusted.
  • The efficacy of data should be controlled by an official authority, someone who has skin in the game. Social responsibility rests with companies as well as with individuals. Regulations must be in place to ensure that machine learning processes are ethically managed. For instance, internal compliance teams may need to be tasked with overseeing an audit of each developed algorithm.
  • Ensure your data is current and timely. Training your algorithm on the reality of the 70s will not help you solve the problems of today. Data should be developed with the future outcome in mind, and if it must challenge the status quo, then that needs to be given enough thought. Some biases can be expected to creep into randomly sampled data; bias is a fact of life. Nonetheless, there is a need to proactively ensure that the data used to train the algorithms of tomorrow represents people equally and without prejudice.
  • Have a strategy to check for biases within the team. If the team training the algorithm comes from a homogenous background, have a different team assess your training data and outputs, and have different people contribute their views to ensure your blind spots are checked.
  • Any AI tool that is put into production needs to be continuously monitored to ensure it remains bias-free, fair and representative (a minimal sketch of one such check follows this list).
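
As an illustration of that last point, here is a minimal sketch of one possible monitoring check. The data shape and the 80% threshold are my own assumptions (the threshold is loosely inspired by the “four-fifths” rule used in employment-discrimination testing), not a standard API or a complete fairness audit.

```python
# Hypothetical production check: flag batches where one group receives
# favourable outcomes at a much lower rate than the best-off group.
def parity_alert(decisions, threshold=0.8):
    """decisions: iterable of (group, favourable: bool) pairs.
    Flags any group whose favourable-outcome rate is below `threshold`
    times the best group's rate."""
    counts = {}
    for group, favourable in decisions:
        got, total = counts.get(group, (0, 0))
        counts[group] = (got + bool(favourable), total + 1)
    ratios = {g: got / total for g, (got, total) in counts.items() if total}
    if not ratios:
        return {}
    best = max(ratios.values())
    # Non-empty result means the batch needs human review.
    return {g: r for g, r in ratios.items() if r < threshold * best}

# Usage on a toy batch of decisions:
batch = [("A", True), ("A", True), ("A", False),
         ("B", True), ("B", False), ("B", False)]
print(parity_alert(batch))  # flags group B (~0.33) against group A (~0.67)
```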

In conclusion

Garbage in, garbage out

In isolation, AI is not inherently biased or evil. But AI is built by people, and people, as well as the data they use, can be implicitly or explicitly biased. Responsibility is, therefore, with the humans, not the machines.

Machine learning, a sub-area of artificial intelligence, is dependent on the objectivity, quality and quantity of the training data set used to teach it, and of the validation data set used to validate or, more importantly, invalidate the model. Erroneous, incomplete or inadequate data will invariably lead to inaccurate predictions.

There is no doubt that the people responsible for creating these and other algorithms that unintentionally discriminated against some groups of people acted in good faith and with good intentions. But these are just a few examples that highlight how badly we need a regulatory body and independent experts to police our algorithm-driven future. There also needs to be adequate training on bias and ethics for the people responsible for designing these algorithms, to ensure the algorithms are appropriate and to minimise the harm they can cause to society at large.

Sources & further reading

https://en.wikipedia.org/wiki/Bias

http://www.hannahfry.co.uk/books

https://www.scientificamerican.com/article/racial-bias-found-in-a-major-health-care-risk-algorithm/

https://www.technologyreview.com/2019/01/21/137783/algorithms-criminal-justice-ai/

https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

https://www.theatlantic.com/ideas/archive/2019/06/should-we-be-afraid-of-ai-in-the-criminal-justice-system/592084/

https://www.propublica.org/article/what-algorithmic-injustice-looks-like-in-real-life

https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

https://en.wikipedia.org/wiki/Ernest_Burgess#Burgess_method_of_unit-weighted_regression

http://content.time.com/time/business/article/0,8599,1954643,00.html

https://eu.usatoday.com/story/tech/2015/07/01/google-apologizes-after-photos-identify-black-people-as-gorillas/29567465/

https://www.seattletimes.com/business/microsoft/how-linkedins-search-engine-may-reflect-a-bias/

https://www.bbc.com/news/technology-45569227

https://www.ibmbigdatahub.com/infographic/four-vs-big-data
