It’s Our Fault That Computers Are Racist

Michael Holmes · Published in Digital Diplomacy · Jul 29, 2020
Photo by EFF Photos on Wikimedia Commons

In 2015, Jacky Alciné, a software engineer in Brooklyn, noticed that his Google Photos account had auto-generated an album titled “Gorillas.” Inside, he found pictures of himself and a friend, incorrectly tagged as gorillas by Google’s image recognition software. After Alciné posted a screenshot on Twitter, Google’s chief architect of social responded publicly, apologizing for the mistake. A research team was dispatched to examine the data. In the end, they determined that the problem was not a malicious act but a symptom of a still-developing technology. One employee pointed out other recent cases in which Google Photos had tagged some white faces as dogs and seals.

But here’s the thing: computers might not actually be racist, but they are stupid. Like, really stupid. They may have vast quantities of processing power, but they only know what they are told. And when it comes to machine learning algorithms, this can be a huge problem.

A facial recognition algorithm is a lot like someone painting a portrait. Just as an artist uses the end of a paintbrush to gauge the distance between tear ducts or find the angle between the pupils and the corner of the mouth, facial recognition programs convert the human face into a series of measurements, referred to as “biometrics.” At its core, the technology is that simple: a picture of someone is fed into the computer, the computer models the person’s features with biometrics, and then compares that set of measurements to a library of data in order to find a match.
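
To make that matching step concrete, here is a toy sketch in Python. The measurement values, names, and distance threshold are all invented for illustration; real systems use learned embeddings with hundreds of dimensions, but the basic move (turn a face into numbers, then find the closest set of numbers in a library) is the same.

```python
import numpy as np

# Hypothetical biometric "measurements" for three enrolled people. The numbers
# are invented purely for illustration (think eye spacing, eye-to-mouth angle).
library = {
    "person_a": np.array([62.0, 41.5, 30.2, 118.7]),
    "person_b": np.array([58.3, 44.1, 28.9, 121.4]),
    "person_c": np.array([65.2, 39.8, 31.6, 115.0]),
}

def find_match(probe, threshold=5.0):
    """Compare a probe face's measurements against the library and return the
    closest entry, but only if it falls within an (arbitrary) distance threshold."""
    best_name, best_dist = None, float("inf")
    for name, reference in library.items():
        dist = np.linalg.norm(probe - reference)  # distance between measurement sets
        if dist < best_dist:
            best_name, best_dist = name, dist
    return (best_name, best_dist) if best_dist <= threshold else (None, best_dist)

# A new photo's measurements, which happen to sit closest to person_b's:
print(find_match(np.array([58.0, 44.0, 29.1, 121.0])))
```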

Biometric Facial Recognition at Houston International Airport, U.S. Customs and Border Protection

But before a computer can start mapping facial features, it needs to know which parts of an image are a face and which are not. To teach the software, programmers feed the computer a set of “training data”: an album of many, many faces. When the computer later performs facial recognition tasks, it uses what it has learned from the training data as the basis for its decisions.
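
Here is a rough sketch of that teaching step. The scikit-learn calls are real, but the “images” and labels below are random stand-ins, used only so the example runs; an actual system would train on many thousands of carefully labeled photos.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in training data: each row pretends to be a flattened 24x24-pixel patch,
# and each label says whether that patch contains a face (1) or not (0).
rng = np.random.default_rng(0)
X_train = rng.random((200, 24 * 24))    # 200 fake image patches
y_train = rng.integers(0, 2, size=200)  # fake face / not-face labels

# "Teaching" the software: the model learns only whatever patterns separate
# the two labels in the training data, and nothing more.
detector = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Later, the detector judges new patches purely on what it saw during training.
new_patch = rng.random((1, 24 * 24))
print(detector.predict(new_patch))  # 1 = "face", 0 = "not a face"
```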

As a result, the more closely the photos being identified resemble the photos in the training dataset, the better the algorithm performs. Unfortunately, if a facial recognition system is going to operate out in the world, the computer needs to handle tons of variation in lighting, angle, and quality. A surveillance photo from a convenience store robbery, for instance, is very different from the stark lighting and controlled environment of a photo in a mugshot database.

MIT Media Lab, Madcoverboy

One of the biggest issues concerning training data is race. In a 2018 study, researchers from MIT and Stanford tested the facial recognition systems developed by Microsoft, IBM, and Megvii. The study examined how well each algorithm could guess the gender of more than 1,200 subjects. To ensure a wide range of skin tones, the subjects were drawn from three African countries and three Nordic countries. The researchers found that all three programs misidentified women of color most often (error rates ranged from 21 to 35 percent). For white, male subjects, every error rate was below one percent.

When looking at training data, the researchers found that one “major U.S. technology company” trained its software on a dataset that was more than 83 percent white and more than 77 percent male.

Photo by Quick PS on Unsplash

Other systems have encountered similar issues. In 2018, the ACLU tested Amazon’s facial recognition program, Rekognition, using pictures of U.S. lawmakers. When checking for matches against a mugshot database, Rekognition incorrectly identified 28 members of Congress as people who had been arrested for crimes. The misidentifications skewed toward members with darker complexions: people of color made up 39 percent of the false matches, despite accounting for only 20 percent of Congress.

The most recent and high-profile study on facial recognition accuracy was released in late 2019, by the National Institute of Standards and Technology. The researchers examined 189 different algorithms, voluntarily submitted by 99 developers. The algorithms were given a dataset of more than 18 million pictures. Shockingly, the researchers found that Asians and Blacks were up to 100 times more likely to be misidentified than white men. Native Americans, meanwhile, had the highest false-positive rate (where one person is incorrectly identified as another).

The NIST study also found that biases existed across a variety of search types. In one-to-many searches, which compare a single image against a large database in order to find a match, Black women were misidentified most often. That disparity is alarming, considering that one-to-many searches are the kind most often employed by police investigators looking for a suspect. In one-to-one matching, the kind used for unlocking phones or checking a passport, Asians, Blacks, and Native Americans all suffered higher false-positive rates.
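
For the curious, the difference between the two search types can be sketched in a few lines of Python. As before, the vectors and threshold are made up; this is only meant to show the shape of each search, not how a production system works.

```python
import numpy as np

def distance(a, b):
    return np.linalg.norm(a - b)

def one_to_one(probe, claimed_reference, threshold=5.0):
    """Verification: does this face match the single enrolled face it claims to be?
    (The kind of check used to unlock a phone or confirm a passport.)"""
    return distance(probe, claimed_reference) <= threshold

def one_to_many(probe, database, threshold=5.0):
    """Identification: search an entire database for the closest match.
    (The kind of search run against a mugshot gallery.)"""
    name, dist = min(((n, distance(probe, ref)) for n, ref in database.items()),
                     key=lambda item: item[1])
    return (name, dist) if dist <= threshold else (None, dist)

database = {"record_041": np.array([61.0, 40.0]), "record_207": np.array([58.5, 44.2])}
probe = np.array([58.7, 44.0])
print(one_to_one(probe, database["record_207"]))  # True: within the threshold
print(one_to_many(probe, database))               # ('record_207', ...): closest entry
```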

Algorithms developed in Asian countries, however, showed a much smaller gap between error rates for white and Asian faces, suggesting that the racial makeup of training data may indeed be a factor in these disparities.

More than an indicator that the technology might not be ready for widespread implementation, facial recognition’s high error rate for faces of color is an example of how the biases within society become concrete, systemic disadvantages. Take the issue of training data representation, for instance. In many cases, well-lit, high-quality photo collections of people of color simply aren’t as readily available. From the earliest days of color photography, photo technology has been optimized for pale complexions. Even today, cell phone light sensors and digital cameras struggle to capture dark skin tones in a variety of conditions.

Photo by Marc Mueller on Unsplash

We like to believe that when technology exacerbates societal issues like racism and misogyny, it is merely a problem of implementation. But computers only know what they are told, and when they are designed to prioritize certain demographics, that’s exactly what they’ll do. If we fail to examine our technology against the social context in which it is developed, we can mistake human prejudice for scientific fact. We start to believe that cameras and photo-tagging software don’t need to improve; it’s just that Black people aren’t photogenic. It’s not that training data is too homogeneous; it’s that most people of color look alike. And that is a dangerous road to go down.

Two years after the gorilla debacle, Google Photos finally “fixed” its tagging problem — by completely removing the tags “gorilla,” “chimpanzee,” and “monkey” from the platform. Mission…accomplished?

Photo by yarne fiten on Unsplash
