How biased AI can teach us to be better humans

Hanne-Torill Mevik
Published in Bakken & Bæck · 13 min read · Feb 27, 2018


Photo by Andy Kelly on Unsplash

Let me tell you, now is an exciting time to be a data scientist with everyone and their grandmothers jumping on the AI bandwagon, all wanting an AI to fix all the things. I am immensely enjoying the new sexy status of the historically nerdy field of Machine Learning. In case you’re not familiar with that term, you can thank PR and marketing people for somehow managing to re-brand the messy intersection where Statistics, Math and Computer Science meet and merge into the much hyped term “Artificial Intelligence”.

Now, before you head out there, eager to build your own chatbot and release it into the wild, there are a few lessons to take to heart. Aside from the feasibility and usefulness of the tasks we can automate today with machine learning (that’s a whole post in itself!), there’s another, more sinister side to the story. I will share with you a cautionary tale of machine bias: how it is already affecting you today, and why you should care that mitigating measures are taken to prevent it where possible, or at least lessen its effect. If you make it to the end of this post, you will also understand why de-biasing is such a difficult task.

Firstly, we should agree on what bias exactly is. Bias can mean many things, but in the scope of this post we turn to Wikipedia for two specific definitions. Bias exhibited by humans is often called Cognitive bias, meaning

a repeating or basic misstep in thinking, assessing, recollecting, or other cognitive processes.

When the outcome of such an illogical reasoning is negative, this type of bias is often referred to as prejudice.

The honest truth is that what we currently call “AI” isn’t capable of actual cognition of any kind, and any misstep in an AI’s programming logic is just buggy code written by an overworked programmer. Rather, the bias we observe in machines is caused by

systematically favouring some outcomes over others, or by using data consisting of an unfair sampling in the population.

A hypothesis

I propose the following hypothesis, that
1. Unfair data is the quantitative expression of cognitive and societal bias, and
2. Machine bias is a product of unfair data.

It follows that unmitigated or unchecked machine bias can perpetuate societal prejudice, but also that machine bias has the potential to be a tool to uncover biases and prejudices we are unaware of.

In this post I will give examples that support my hypothesis and argue why you should care. Most of these examples originate in the U.S., where industry giants like Google, Facebook, Amazon, Microsoft, Apple and Baidu are drawing on near unlimited resources and pushing the envelope of research and development and the adoption of Machine Learning into products.

Releasing your AI thinking it is the shit

In Norway, and the Nordic countries in general, we are lagging a fair bit behind when it comes to incorporating machine learning into educational programs and commercial products. Now is therefore an opportune time to learn from the mistakes of others and avoid the common pitfalls.

The pitfallacy — What’s in a “data”?

Let’s start with an association game around the word “data”. Some of you will think of numbers and statistics, something that is well defined, well known, static and immutable, impartial, fair or unfair in its inability to adjust to mitigating circumstances. Others among you will think of Star Trek’s Data, as many of the words I listed are also counted among that character’s personality traits.

Our assumption is that if we feed our believed-to-be impartial data into a machine learning algorithm — in order to train a model to solve some task for us — then the resulting model will be just as impartial, rational and fair as the input data.

The fallacy is that data is not at all impartial! It is just as biased and prejudiced as the humans who generated it. You can be sure that all the -isms are represented: Sexism. Ageism. Racism. Ableism. Heterosexism. Nativism. Lookism. (Yup, I had to look some of them up.) When the data is skewed towards some feature, the resulting machine-learned model will be equally skewed.

Research shows that when applying standard and widely used Natural Language Processing tools to our written languages, machine learning algorithms will record and display the same biases that humans demonstrate in psychological studies. This happens because semantics, the meaning of words, necessarily reflect regularities latent in our cultures, some of which we know to be prejudiced.

Bias embedded in the data

Take gender stereotypes found in written texts as an example. Our language is the lens through which we learn to understand the world, its populations and their stories. We learn that the word “man”, in both Norwegian and English, can refer to a male individual or to “all of humanity”, but the word “woman” never does. The narrative in my math and physics books from high school and university was always male, using a “he” to describe a man, someone in general or the reader, and never a “she”. (In recent years an effort has been made to use more equally gendered language.) It is therefore no coincidence that machine-learned models — which are trained on texts and articles scraped from Twitter, Wikipedia, books and news articles — will learn the same gendered semantic patterns that exist in our historical and current ways of expression.

The Word2Vec algorithms published in 2013 quickly achieved popularity for combining good performance with a fairly understandable representation of words as numbers. The models learn the linguistic context of words and their semantic relationship in an unsupervised fashion, meaning that they receive no instructions, nor do they have any prior knowledge about the language.

One fun property of these language models is that we can query them for linguistic analogies and similarities. We can ask questions like: “Paris is to France, what Oslo is to <blank>” and receive the answer Norway.
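To make the analogy trick concrete, here is a minimal sketch of the underlying vector arithmetic. The word vectors below are hand-crafted toy values for illustration — real Word2Vec vectors are learned from billions of words of text — but the query works the same way: “Paris is to France what Oslo is to <blank>” becomes France − Paris + Oslo.

```python
import numpy as np

# Toy word vectors, invented for the demo (real Word2Vec vectors are
# learned from text and have hundreds of dimensions).
vectors = {
    "paris":  np.array([1.0, 0.0, 0.5]),
    "france": np.array([1.0, 1.0, 0.5]),
    "oslo":   np.array([0.0, 0.0, 0.9]),
    "norway": np.array([0.0, 1.0, 0.9]),
    "doctor": np.array([0.5, 0.5, 0.0]),
}

def analogy(a, b, c, vocab):
    """Answer 'a is to b what c is to <blank>' via b - a + c."""
    target = vocab[b] - vocab[a] + vocab[c]
    # Pick the closest remaining word by cosine similarity.
    candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    def cos(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(analogy("paris", "france", "oslo", vectors))  # -> norway
```

With a real pretrained model (for instance via the gensim library) the same query is a one-liner, but the arithmetic above is all there is to it.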

Illustration of how Word2Vec learns semantic relationships between different types of words.

We also find that “she” is to “nurse” what “he” is to “doctor”. And that “female” and “woman” are more closely associated with arts and humanities occupations and with the home, while “male” and “man” are more closely related to math and engineering professions. Or that pleasant words like “gift” or “happy” are placed in context with white American names, while black American names are more associated with unpleasant words. This shouldn’t come as a surprise, since there is already plenty of public debate about bias concerning foreign- or ethnic-sounding names, exemplified by two all-too-common and unfortunate practices: job applicants not being called in to interviews, and house seekers struggling to rent an apartment, both due to having Arabic-sounding names. These are a few examples of how the bias embedded in the data leads to machine bias: biased semantic patterns in our language are expressed in a learned mathematical model.
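These association findings come from measuring how much closer a target word sits to one attribute set than to another. Here is a toy sketch of that style of test, again with invented two-dimensional vectors (the published studies run the same kind of comparison on real learned embeddings):

```python
import numpy as np

def cos(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(word, attrs_a, attrs_b, vocab):
    """Mean similarity to attribute set A minus mean similarity to set B."""
    sim_a = np.mean([cos(vocab[word], vocab[a]) for a in attrs_a])
    sim_b = np.mean([cos(vocab[word], vocab[b]) for b in attrs_b])
    return sim_a - sim_b

# Toy vectors, invented so that "nurse" leans toward "she" and
# "doctor" toward "he" — mirroring what real embeddings exhibit.
vocab = {
    "she":    np.array([0.9, 0.1]),
    "he":     np.array([0.1, 0.9]),
    "nurse":  np.array([0.8, 0.2]),
    "doctor": np.array([0.2, 0.8]),
}

print(association("nurse", ["she"], ["he"], vocab))   # positive
print(association("doctor", ["she"], ["he"], vocab))  # negative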

Bias because of missing data

On the opposite end of the spectrum, machine bias in Computer Vision and object detection commonly arises from a lack of data, when the set of images used to train the model does not include all possible variations of the object you want it to detect. Evidence of this is found in some quite cringe-inducing examples. For instance, there was the time Google Photos auto-tagged black people as gorillas, with Flickr following suit. And then Nikon’s face recognition software couldn’t deal with Asian features. Machine bias due to a lack of varied data is not race-exclusive; it can also include ageism. How about the app that tries to classify beauty, but associates only images of young, white people with beautiful faces? Or the one that attempts to predict age from blood tests, failing miserably when used on an older population?

To be more specific, the racist bias exhibited by Google Photos occurred because the algorithm was shown only faces with similar, lighter skin colour during the training of the object detection model. In addition to learning to recognise the edges and colour changes that indicate eyes, mouths, nostrils and teeth, the model also learned from that unvaried data set that a light skin colour is an equally important feature for accurately detecting human faces. If, instead, the training data had contained all possible variations of skin tone across the entire human populace, the model would have learned to put less emphasis on the skin, and more on the presence of two eyes, a mouth, nose, chin and forehead.

Actually, this is the main takeaway lesson from this post:
If your intuition is telling you that there is a feature or property in your data that shouldn’t be as important as others: then either remove it entirely, if you can, or make sure to include it in its entire spectrum of possible values to lessen its effect in the learned model.

Bias from reading between the lines

What if your data is neither unstructured texts nor images, but rather structured textual data in the form of personal information, be it name, gender, age, address, income, educational background, profession, sexual preference, political affiliation? Please pause for a moment here and appreciate just how much personal information you have disclosed to for-profit companies, like your bank, phone service provider, insurance company, Facebook, LinkedIn, grocery store loyalty programs. The list goes on. All of these companies have incentives to build recommendation systems based on your data (or to sell them to someone who does) to create or offer services specifically tailored to suit you. The stakes of certain systems are higher than others; arguably your life will be more impacted by a declined house loan, than say, a poor book suggestion on Amazon Kindle.

One infamous and alarming example of a machine learned prediction system used in high-stakes decision making comes from the American criminal courts. A commercial software called COMPAS was employed by judges to predict how likely a person charged with a crime was to re-offend at a later time. The calculated risk score was used to determine the sentencing, which could mean offering the defendant a plea deal or a prison sentence of several years. An extensive analysis was performed by ProPublica, who found that the risk assessment was heavily biased against the African-American community.

Examples of risk scoring of a white male and a young black female. Source: ProPublica.

The article explains that the practice of using race, nationality and skin colour in making recidivism predictions ended in the 1970s, when it became politically unacceptable. So where did the machine bias come from?

COMPAS calculated a score derived from 137 questions that were either answered by defendants or pulled from criminal records. Race was not one of the questions. However, the survey asked defendants questions such as: “Was one of your parents ever sent to jail or prison?”, “How many of your friends/acquaintances are taking drugs illegally?” and “How often did you get in fights while at school?”. The questionnaire further asked people to agree or disagree with statements like “A hungry person has a right to steal” and “If people make me angry or lose my temper, I can be dangerous”.

A validation study of the risk-of-recidivism score calculated by COMPAS showed an accuracy of 68 percent. But when questions and data items that could be correlated with race — such as poverty, joblessness and social marginalisation — were omitted from the data, the accuracy went down to hardly better than a coin toss. It is easy to draw up a picture of one “rotten apple” in the fruit basket that is the white working-middle class, who would answer no to the questions mentioned above, yet still be likely to commit new crimes later on. In contrast, considering that 39 percent of black American children and adolescents and 33 percent of Latino children and adolescents are living in poverty — compounded with unemployment rates twice that of white Americans — a high risk score would automatically be attributed to a black or Latino individual simply by virtue of being born.

In the case of COMPAS, the failings are on multiple levels and not simply caused by using a machine-learned model trained on biased data. First, there is the misuse of a tool for sentencing that was originally intended to assist in selecting candidates for suitable rehabilitation programs. Secondly, there is the non-trivial challenge of asking questions that are neutral to socioeconomic status. And lastly, there is the problem of perpetuating social bias through unnecessarily harsh punishment of offenders, leaving them to struggle with reintegration into society, carrying a criminal record they didn’t deserve in the first place.

Is your name Acceptable?

Fortunately, the American criminal court is on the more extreme end of the spectrum, and most of us will hopefully never be the object of its attention. Job matching, however, is a seemingly more innocuous, yet still high-stakes, recommendation system. Recruiting agencies and headhunters are now automating the initial phase of matching profiles with job listings. Almost all of them use LinkedIn’s Recruiter tool during some part of the process, a tool which employs machine learning to recommend profiles for a job listing.


LinkedIn actually does not ask for information regarding gender and ethnicity. But if you’ve managed to read this far, you now understand that textual data carries semantic meaning and relationships that can be learned by mathematical algorithms. Our names are gendered, our educational backgrounds and cities of residence are correlated with ethnicity, school names are related to socioeconomic status, and previous work experience and job titles will shape future matches with new job listings.

Other marketplaces for job listings employ a click-based payment model, in which the number of clicks on a listing directly affects its earning potential. Their models exhibit a conflicting-goals bias, caused by the monetary incentive to show jobs that fit people’s self-image, thereby perpetuating societal bias. Women, for instance, are more likely to click on an ad for a position related to “nursing” than on one advertising for a “medical technician”.

The banking industry serves as another example of non-transparent, high-stakes decision making regarding housing loans. Several banks that operate in the Nordic market have already started to supplement, or even replace, the traditional credit score with machine learned models in order to decide whether to grant applications for housing loans. They use their internal client data to train a model that predicts how likely future clients are to keep up with mortgage payments. Surprisingly, some banks are actually experiencing an increase in the number of loans granted, so hopefully this practice turns out to be a win-win for all parties involved and not the beginning of the next credit crunch.

Bias-skepticism is for Everyone

There are several reasons for why you should care about Machine bias. You could be motivated by…

  • Idealism: you want to make the world a better and more inclusive place;
  • Professional interest: your company or product might not survive negative publicity;
  • Altruism: you are concerned about fair availability of opportunities and resource allocation;
  • Truth-seeking: you want the truest, bestest data there can be;
  • Personal gain: you want access to opportunities, education, employment, health care

because eventually you will get a score, too.

Whichever category of motivation you fall into, in all likelihood you will be affected by machine bias in one way or another. It can be blatantly obvious, like the Facebook filter bubble, or it can be more insidious, hidden within job listings that are never shown to you because your name fits a minority pattern.

Or we could flip the problem of machine bias on its head:

Perhaps the medical treatment that you need is not suggested to you, because treatment based on ethnic traits is considered a topic that is too difficult, too taboo and too politically dangerous to allow medical research into. Perhaps machine learning techniques can be used to uncover systematic differences in drug and treatment efficacy between different ethnic groups in a more palatable way.

We can do better

As beings with natural intelligence we are equipped to be self-aware and to consciously counteract our own learned cognitive biases, but the algorithms available to us at present do not incorporate bias-adjusting measures. Fortunately, judging by the unprecedented number of papers, keynotes and general chatter about bias and fairness at the Conference on Neural Information Processing Systems in 2017, much attention is now being directed towards developing de-biasing techniques and methods.

There are already some simple steps we can take to limit the effect of bias on our data. For instance, we can synthetically augment the data used to train our models. This is an established, non-controversial practice for generating sufficiently large and varied data sets that are suitable for learning. Essentially this involves taking samples from the real data, then performing small changes to the samples before adding these new, altered samples to the data set. For image data, the alteration can be adding noise, or shifting, scaling, rotating or flipping the image. Changes in skin tones and contrast settings could be a way to make the algorithm less sensitive to colour as a feature.
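The flips, shifts and tone changes described above can be sketched in a few lines. This is a toy version using NumPy alone (the function and parameter choices are my own; libraries like torchvision or imgaug offer far richer pipelines):

```python
import numpy as np

def augment(image, rng):
    """Return a randomly altered copy of an H x W x 3 uint8 image."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]                      # horizontal flip
    shift = rng.integers(-2, 3)                 # small horizontal shift
    out = np.roll(out, shift, axis=1)
    factor = rng.uniform(0.8, 1.2)              # brightness / tone change
    out = np.clip(out.astype(np.float64) * factor, 0, 255).astype(np.uint8)
    return out

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)  # dummy image
augmented = [augment(img, rng) for _ in range(3)]  # three synthetic variants
print(len(augmented), augmented[0].shape)
```

Each altered copy is added back into the training set, so the model sees many more variations than were ever photographed.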

For text data, an analogous approach exists, where synthetic sentences can be added to the data by randomly replacing words with synonyms taken from a dictionary. One possible augmentation could be to add sentences in which male gender nouns are replaced with female ones, or Nordic-European-sounding names are switched with names of different ethnic origins. Recently, mathematical techniques have also been proposed to reduce the effect of gendered language, by adjusting either the learned word vectors themselves or the model’s output after a prediction is made.
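The gender-swap augmentation is simple enough to sketch directly. The swap table below is a tiny illustrative sample; a real pipeline would use a much larger dictionary and handle casing and grammar properly:

```python
# Illustrative swap table; a production version would be far larger.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "man": "woman", "woman": "man"}

def swap_gender(sentence):
    """Return a copy of the sentence with gendered words exchanged."""
    return " ".join(SWAPS.get(w, w) for w in sentence.lower().split())

original = "he is a doctor and she is a nurse"
augmented = swap_gender(original)
print(augmented)  # "she is a doctor and he is a nurse"
```

Adding both the original and the swapped sentence to the training data gives the model no statistical reason to tie “doctor” to one gender.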

Finally, try to be brave and look your biased AI straight in its binary face. All of our collectively flawed humanity is looking back at you, and it’s not pretty. Review your company’s hiring practices (check out the directory of women working in Machine Learning and Black in AI). When the workforce is diverse, your colleagues or employees will bring perspectives and experiences that help you check your bias before implementing and releasing that AI into the world.




Theoretical Physicist turned Data Scientist, currently at Fürst MedLab in Oslo. I write and give talks about Machine Learning, technology and societal concerns.