Fixing Racist Facial Recognition in 5 minutes

Ben Taylor
4 min read · Mar 5, 2018


Recently, The New York Times ran a story on racist AI:

The general public gasped and the AI community rolled their eyes [not because they don’t care, but because they know why this happens]. Personally, I think the traction this story gained is good, because it builds awareness for those who aren’t thinking about this. The reality in AI is that if a model is trained on a dataset without proper minority representation, it will not predict that minority well. The New York Times referenced an effort from the group gendershades.org, which built a dataset from Sweden, Finland, and Iceland for white faces and from Rwanda, Senegal, and South Africa for black faces, and then used it to test commercial software for gender-classification accuracy. Even the AI community members who rolled their eyes should be surprised that the error rate for black females climbed to nearly 35%. Yikes!

https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html

For someone with experience shipping gender models, anything under 95% accuracy is terrible. A 35% error rate makes me wonder if the training set had ANY black representation. The GenderShades study the New York Times referenced had access to 1,270 images. The benefit of being a company and not an academic researcher is that I have access to a little more data: 50,568,004 images of faces. I even have the luxury of selecting out the same country mix:
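To make that selection step concrete, here is a minimal pandas sketch; the metadata file and column names are hypothetical, not our internal schema:

```python
import pandas as pd

# Hypothetical metadata file: one row per face image, with a country label.
faces = pd.read_csv("face_metadata.csv")  # columns: image_path, country, gender, race

# The same country mix used by the Gender Shades benchmark.
countries = ["Sweden", "Finland", "Iceland", "Rwanda", "Senegal", "South Africa"]
subset = faces[faces["country"].isin(countries)]

print(f"Selected {len(subset):,} of {len(faces):,} images")
```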

Great, now that I have these images I want to build two deep learning models. To do this I just have to organize the data in Excel: the first column points to the image locations (URL or local path), and to the right of that I can add as many continuous or categorical variables as I want to predict.

If I can organize my data in Excel for gender and for gender + race, that is all I need to build two deep nets.
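As a hedged illustration of that layout (the paths and column names below are made up, not a required schema), the two training CSVs could be written like this:

```python
import pandas as pd

# Hypothetical rows: the first column is the image location (URL or local path),
# and the columns to the right are the labels the network should learn to predict.
rows = [
    {"image": "s3://faces/00001.jpg", "gender": "female", "race": "black"},
    {"image": "s3://faces/00002.jpg", "gender": "male",   "race": "white"},
]
faces = pd.DataFrame(rows)

# CSV 1: gender only.
faces[["image", "gender"]].to_csv("gender_only.csv", index=False)

# CSV 2: gender + race predicted simultaneously (two label columns).
faces[["image", "gender", "race"]].to_csv("gender_and_race.csv", index=False)
```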

Something that wasn’t mentioned in the downstream processing from Excel is the automatic facial localization and preprocessing. We run three deep networks in series to better isolate the face. If you look closely, this is demonstrated on Taylor Swift’s face below:
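Our production stack isn’t shown here, but the open-source MTCNN detector is a well-known example of the same idea, chaining three small networks in series. A minimal sketch, with a hypothetical image path:

```python
from PIL import Image
from facenet_pytorch import MTCNN  # pip install facenet-pytorch

# MTCNN chains three small CNNs (P-Net -> R-Net -> O-Net) to find and crop a face.
detector = MTCNN(image_size=160, margin=20)

img = Image.open("taylor_swift.jpg")   # hypothetical input image
boxes, probs = detector.detect(img)    # bounding boxes + detection confidences
face = detector(img)                   # aligned 160x160 face crop as a tensor
print(boxes, probs)
```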

Results in less than 5 minutes:
After submitting our first CSV, our deep net is trained for gender in 3 min 29.46 s and we get:

Overall validation accuracy: 97.75%
Gender accuracy by sex (male/female): 94.73% / 97.88%
Gender accuracy by race (black/white): 98.3% / 96.6%

So black females went from being the least accurately predicted group to the most accurately predicted.
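In case the breakdowns above look unfamiliar: each number is just ordinary validation accuracy computed within one slice of the validation set. A tiny illustrative sketch with made-up arrays:

```python
import numpy as np

# Hypothetical validation arrays: true gender, predicted gender, and the
# race of each validation subject.
y_true = np.array(["male", "female", "female", "male"])
y_pred = np.array(["male", "female", "male", "male"])
race   = np.array(["white", "black", "black", "white"])

for group in np.unique(race):
    mask = race == group
    acc = (y_true[mask] == y_pred[mask]).mean()
    print(f"gender accuracy for {group}: {acc:.2%}")
```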

Predicting Gender AND Race:
Now, submitting the dataset for simultaneous gender and race predictions, we have a model trained in 4 min 20.9 s and we get:

Overall validation accuracy: gender 95.0%, race 96.14%
Gender accuracy by sex (male/female): 89.40% / 96.50%
Gender accuracy by race (black/white): 97.31% / 93.42%
Race accuracy by sex (male/female): 95.82% / 97.35%
Race accuracy by race (black/white): 91.88% / 99.06%

It isn’t surprising to take a hit on gender accuracy when you expect the network to learn and predict two different things at once. Also, these networks are optimized for mobile deployment, so they are very small. Our max-accuracy model setting doesn’t take as much of a hit on multiple label combinations, but for this blog post I am trying to show off time to value, hence the fast model demonstration.
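For readers curious what “learning two things at once” looks like, here is a rough multi-task sketch in PyTorch: a shared backbone feeding one head per label. This is a generic illustration, not our mobile-optimized architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

class GenderRaceNet(nn.Module):
    """One shared backbone, one classification head per label."""
    def __init__(self):
        super().__init__()
        base = models.mobilenet_v2(weights=None)  # small, mobile-friendly backbone
        self.features = base.features             # 1280-channel feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.gender_head = nn.Linear(1280, 2)     # male / female
        self.race_head = nn.Linear(1280, 2)       # black / white, as reported above

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)
        return self.gender_head(x), self.race_head(x)

model = GenderRaceNet()
gender_logits, race_logits = model(torch.randn(1, 3, 160, 160))
# During training the total loss is simply the sum of the two cross-entropy losses,
# which is why each head can pull accuracy slightly away from the other.
```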

Did We Fix Racist Gender Predictions With This Example?

Did we improve them for black/white differences? Yes, we did. We actually improved them so much that we reversed the gap: now white males are the accuracy concern. However, for this particular dataset our gender predictions will not generalize well to other races such as Asian, Latino, Pacific Islander, Indian, and so on. This is why it is important for your dataset to represent as diverse a population as possible. With our global dataset we can accomplish this, exposing the network to the wonderful diversity we see globally.

Race Isn’t Discrete:
Humans have decided that race is discrete (Black, White, Latino, Asian, Indian, etc.), but in reality it is much more complicated than that. In my mind, race is effectively infinite (e.g., Greenland could be its own race), and by training AI on large global datasets we can work to protect minority groups. The complexity of race will be discussed further in our country-of-origin prediction work, which we are publishing this week from a coffee shop.

Deployment:
If we build a deep network we like, we can now deploy it to Docker, mobile, or an inference cloud. This is how easy deep learning should be.
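For a sense of what that hand-off can look like outside our platform, here is a minimal sketch using standard PyTorch export tooling; the model and file names are hypothetical:

```python
import torch
from torchvision import models

# Stand-in for "a deep network we like" (hypothetical; swap in your trained model).
model = models.mobilenet_v2(weights=None).eval()
example = torch.randn(1, 3, 160, 160)

# Docker / inference cloud: a traced TorchScript archive can be served from any container.
torch.jit.trace(model, example).save("gender_model.pt")

# Mobile: ONNX is one common interchange format for on-device runtimes.
torch.onnx.export(model, example, "gender_model.onnx",
                  input_names=["image"], output_names=["gender_logits"])
```

An exported artifact like this is what actually gets copied into a Docker image or bundled with a mobile app.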

