Tackling the Concerns of AI: The Problem with our Data

AI4ALL Team
Published in AI4ALL
6 min read · Oct 30, 2019
Anoushka Narayan (left) working on a problem with another camper at Stanford AI4ALL 2019

Guest post by Anoushka Narayan, Stanford AI4ALL ‘19

AI4ALL Editor’s note: Meet Anoushka Narayan, a Stanford AI4ALL 2019 alum and Changemaker in AI. In this post, she discusses her interest in human-introduced bias in machine learning, and how this interest led to her involvement in AI4ALL’s summer program. She explains how over the course of the program she got to witness the inadvertent introduction of bias to datasets first-hand, and how meeting other girls like herself made her even more committed to ensuring their representation in the data and in the field.

Last fall, a wave of articles about Amazon crashed onto news feeds. Amazon wasn’t new to making headlines — the juggernaut of a company has over 300 million users (Kim, Business Insider) and is continually coming out with new services — but this time, the headlines read: “Amazon’s recruitment AI was retired after it was discovered to have an inherent bias against women.”

Bias comes from prejudice, and prejudice is an inherently human quality. So how does a machine learning algorithm develop gender bias? Surely, no one sat there and coded sexism into the hiring formula.

The real problem is with the data. Our human data.

Amazon’s hiring algorithm was trained to assess applicants by observing patterns in the applications submitted to the company over the previous ten years. The majority of those applicants were male, and of the people who were hired, an even higher percentage were male. Naturally, the algorithm learned to treat male applicants as favorable and female applicants as unfavorable. It went so far as to penalize the word “women(’s)” on resumes, discrediting achievements such as “president of women in engineering club” (Dastin, Reuters).
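
To make that mechanism concrete, here is a toy sketch in Python. This is not Amazon’s system: the resumes, the hiring labels, and the scikit-learn model below are all invented for illustration, and they show only how a classifier trained on skewed historical decisions can learn to assign a negative weight to a word like “women’s”.

```python
# Toy illustration (NOT Amazon's system) of how a model trained on
# historically skewed hiring outcomes can learn to penalize a word.
# All resumes and labels below are invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

resumes = [
    "software engineer python chess club captain",         # hired
    "machine learning engineer robotics club president",   # hired
    "software engineer java hackathon winner",              # hired
    "software engineer women's chess club captain",         # rejected
    "data scientist women's coding club president",         # rejected
    "software engineer python hackathon winner",            # hired
]
hired = [1, 1, 1, 0, 0, 1]  # skewed historical outcomes

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(resumes)
model = LogisticRegression().fit(X, hired)

# The default tokenizer reduces "women's" to the token "women".
# A negative weight means the word lowers the predicted chance of hiring.
idx = vectorizer.vocabulary_["women"]
print("learned weight for 'women':", model.coef_[0][idx])
```

The model never sees gender directly; it simply learns that a word correlated with past rejections is a useful signal, which is the kind of proxy bias the Reuters report described.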

Amazon was not alone. Google Photos labeled pictures of African Americans as gorillas. IBM’s gender-recognition algorithms are excellent at identifying white men, but far less accurate for dark-skinned women. The COMPAS algorithm, which predicts how likely a defendant is to re-offend, rated black defendants as higher risk than they actually were (Cossins, New Scientist). Clearly, no single company is perpetrating this. It is a direct result of our inaccurate and incomplete data.

We are in the middle of a diversity crisis, and we’ve reached a point where if we continue the same way, our prejudices will be hard-coded into the very machines that we rely on for help.

When I read the article about Amazon last year, I grew curious. Curious about artificial intelligence and curious about why our data was so lacking where it needed to be full. It was this curiosity that brought me to Stanford AI4ALL 2019.

Anoushka (right) with another student at the Stanford AI4ALL 2019 Summer Program

Stanford AI4ALL was a game-changing experience on a personal level. Attending a competitive high school tricked me into believing that friendships were futile because your friends of today could be your biggest competition tomorrow. AI4ALL completely reversed that whole mentality.

For the first time, I truly bonded with the other girls — no strings attached. When we promised to keep in touch, I meant it. Everyone was incredibly passionate and intelligent, which only made it easier for me to pursue the question I had come to answer: Why are our datasets so terrible, and what can we do to remedy them?

A few days into the program, Dr. Olga Russakovsky, one of AI4ALL’s founders, traveled from Princeton University to share her insight on this very problem. After hearing some more alarming examples, Dr. Russakovsky told us about the work her lab is doing to combat bias in machine learning algorithms. She reminded us that artificial intelligence is an interdisciplinary domain — it’s not a singular field of study, but a shift in the way we perceive technology. The way we moved from manual labor to automation is the same way we’ll move from automation to artificial intelligence. That change brings the potential for a lot of collateral damage, including drastic labor shifts and bias baked into the first AI systems. Easing the growing pains as we shift into the machine learning revolution will be an issue at the forefront of government policy and ethical debates.

If we want people to accept the technology, they first have to trust it. They have to see that it is helping them, not destroying the lives of the people around them.

As eye-opening as Dr. Russakovsky’s lecture was, the problem she described seemed like something we would only encounter after college, once we started working in the field. Little did we know that we would soon have an unplanned but personal encounter with the effects of skewed datasets.

As part of the summer program, we split into small research groups, each tackling a different application of machine learning. Three of the four groups had projects that hinged on sifting through and classifying a dataset. One of these was the computer vision group. Their goal was to take a collection of satellite images and decide which ones showed areas of poverty so that the appropriate resources could be deployed.

A couple of days before the final presentation, the computer vision team discovered a significant flaw in their dataset. The images in each of their sets (training, validation, and testing) skewed heavily toward poverty, which could have erroneously inflated their accuracy figure. For example, if 80% of a dataset consists of images of impoverished areas, an algorithm that labels every single image as showing poverty automatically achieves 80% accuracy. That number does not reflect the true reliability of the algorithm, because the algorithm may be labeling every image as poverty without actually learning anything from the images. For the computer vision team, this meant that an accuracy figure that looked quite good may have said far less about their model than it appeared to.
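
Here is a minimal sketch of that arithmetic in Python. The numbers are hypothetical rather than the team’s actual data: a “model” that predicts poverty for every image scores about 80% plain accuracy on an 80/20 split, while a class-balanced metric shows it has learned nothing.

```python
# Hypothetical example of how class imbalance inflates plain accuracy.
# 1 = "poverty", 0 = "no poverty"; the labels are randomly generated.
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.choice([1, 0], size=1000, p=[0.8, 0.2])  # ~80% poverty images

# A degenerate "model" that ignores the images and always predicts poverty.
y_pred = np.ones_like(y_true)

plain_accuracy = (y_pred == y_true).mean()
print(f"Plain accuracy: {plain_accuracy:.2f}")  # ~0.80, looks respectable

# Balanced accuracy averages the recall of each class, so a model that
# never identifies the "no poverty" class falls to 0.50 (chance level).
recall_poverty = (y_pred[y_true == 1] == 1).mean()
recall_no_poverty = (y_pred[y_true == 0] == 0).mean()
print(f"Balanced accuracy: {(recall_poverty + recall_no_poverty) / 2:.2f}")
```

Rebalancing the splits or reporting per-class metrics is one common way to keep a headline accuracy number honest.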

It took a lot of snacks and late nights for the team to work through it, but at the end of it all, they had a stupendous presentation and plenty of stories to share.

I am retelling this story because I think there’s a valuable lesson here. If eight high school students at a summer program dodged a landmine with their dataset, what about the datasets that companies are using? How are we validating the accuracy and completeness of our datasets? Are we willing to trust a system that boasts a 99.9% accuracy rate if we don’t know enough about the data that produced that number? AI will change the world. But to change the world, AI first has to understand it.

AI could help us eradicate seemingly intractable problems like global poverty, or it could exacerbate the divide between the rich and the poor. It could find the cure for cancer or possibly unleash a global pandemic. Ultimately, as programmers, we get to decide what we strive for, and it all starts with having accurate and complete sets of data.

About Anoushka

Anoushka Narayan is a sophomore at Westwood High School in Austin, Texas. Pursuing her interest in physics and computer science, she attended the Stanford AI4ALL program in 2019. Anoushka is currently working on an independent passion project in which she hopes to create a portal that will connect entrepreneurs with investors regardless of geographic barriers. She also enjoys writing here on Medium. You can find more of her articles on her profile: Nara


AI4ALL is a US nonprofit working to increase diversity and inclusion in artificial intelligence.