Amazon Scraps Secret AI Recruiting Engine that Showed Biases Against Women

AI research scientists at Amazon uncovered biases against women in their machine learning recruiting engine

October 11, 2018 by Roberto Iriondo

Credit: The Verge | “It is the mission of our generation to build fair AI.” ~ Omar U. Florez

Distinguished Professor Stuart Evans mentioned during a lecture at Carnegie Mellon University how biases in machine learning algorithms can negatively affect our society, whether they are introduced unconsciously through supervised learning or missed during audits of other types of machine learning. In this case, Amazon’s AI research team had been building a machine-learning-based recruiting engine since 2014. The engine reviewed applicants’ resumes with the aim of automating the search for top talent.

“Everyone wanted this Holy Grail,” said one AI research scientist on the team. “They literally wanted it to be an engine where I’m going to give you 100 resumes, it will spit out the top five, and we’ll hire those.” However, by 2015, Amazon realized its new system was not rating candidates for software developer jobs and other technical posts in a gender-neutral way.

Amazon’s recruiting model was trained to vet applicants by analyzing patterns in resumes submitted to the company over a 10-year period. That history reflected the male dominance across the tech industry, so the data fed to the model was skewed toward men rather than gender-balanced. The model absorbed that skew, and most of the candidates it rated as ideal were men.

Amazon’s research team states that it modified the central algorithms to make the model neutral to these gender biases. However, that was no guarantee that the engine would not devise other ways of sorting candidates (e.g. male-dominant keywords in applicants’ resumes) that could prove discriminatory.
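To make the proxy problem concrete, here is a minimal sketch in Python. It is not Amazon’s method: the resume history, the term names (`womens_chess_club`, `netball`) and the scoring rule are all invented for illustration. It blocks one explicitly gendered term and then shows that a hypothetical proxy term, which happens to co-occur with it in the toy history, still drags a resume’s score down.

```python
from collections import defaultdict

# Fabricated toy history, for illustration only: (resume terms, was_hired).
# "womens_chess_club" stands in for an explicitly gendered term and
# "netball" for a hypothetical proxy term that co-occurs with it.
HISTORY = [
    ({"python", "executed", "aws"}, True),
    ({"java", "executed", "linux"}, True),
    ({"python", "led", "linux"}, True),
    ({"java", "executed", "aws"}, True),
    ({"java", "linux", "aws"}, True),
    ({"python", "womens_chess_club", "aws"}, False),
    ({"java", "womens_chess_club", "netball"}, False),
    ({"python", "netball", "linux"}, False),
]


def term_scores(history, blocked=frozenset()):
    """Score each non-blocked term as (hire rate with term) - (overall hire rate)."""
    overall = sum(was_hired for _, was_hired in history) / len(history)
    seen = defaultdict(int)
    hired = defaultdict(int)
    for terms, was_hired in history:
        for term in terms - blocked:
            seen[term] += 1
            hired[term] += was_hired  # True counts as 1, False as 0
    return {term: hired[term] / seen[term] - overall for term in seen}


def score_resume(terms, scores, blocked=frozenset()):
    """Average the scores of the resume's known, non-blocked terms."""
    usable = [scores[t] for t in terms - blocked if t in scores]
    return sum(usable) / len(usable) if usable else 0.0


# "Neutralize" the model by blocking the explicit term only.
BLOCKED = frozenset({"womens_chess_club"})
scores = term_scores(HISTORY, BLOCKED)

# Two resumes with identical skills that differ in one non-skill term.
with_proxy = score_resume({"python", "aws", "netball"}, scores, BLOCKED)
without_proxy = score_resume({"python", "aws", "executed"}, scores, BLOCKED)

print(f"netball: {scores['netball']:+.3f}, executed: {scores['executed']:+.3f}")
print(f"resume with proxy term: {with_proxy:+.3f}, without: {without_proxy:+.3f}")
```

In this toy setup the blocked term never receives a score, yet the resume containing the proxy term still ranks lower than an otherwise identical one. Removing explicit terms does not remove the correlation the model learned from biased history.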

Employers have long dreamed of harnessing technology to widen the hiring process and reduce reliance on the subjective opinions of human recruiters. Nevertheless, ML research scientists such as Nihar Shah of the Machine Learning Department at Carnegie Mellon University, whose research spans statistical learning theory and game theory with a focus on learning from people, say there is still much work to do.

“How to ensure that the algorithm is fair, how to make sure the algorithm is really interpretable and explainable — that’s still quite far off,” Professor Shah mentioned.

Credits: Han Huang | Data Visualization Developer | Reuters Graphics

Male-dominant keywords on resumes remained pivotal even after the algorithms in Amazon’s recruiting engine were modified. The research group created 500 models, each focused on a specific job function and location, and taught each to recognize over 50,000 parameters that showed up on applicants’ resumes. The algorithms ultimately learned to assign little significance to skills that were common across all applicants, such as programming languages and platforms used.
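One standard way to express "terms shared by every applicant carry little weight" is inverse document frequency (IDF). The article does not say which weighting Amazon’s models used, so the sketch below is only an analogy; the function name `idf_weights` and the sample resumes are made up for illustration.

```python
import math


def idf_weights(resumes):
    """Return log(N / document_frequency) for each term across N resumes."""
    total = len(resumes)
    doc_freq = {}
    for terms in resumes:
        for term in set(terms):  # count each term once per resume
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(total / count) for term, count in doc_freq.items()}


# Fabricated sample resumes, for illustration only.
RESUMES = [
    ["python", "aws", "executed"],
    ["python", "linux", "executed"],
    ["python", "aws", "mentored"],
    ["python", "linux", "aws"],
]

weights = idf_weights(RESUMES)
for term, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{term:10s} {weight:.3f}")
```

A skill listed on every resume gets a weight of zero under this scheme, so the ranking is driven by the rarer terms. If those rarer terms happen to correlate with gender, they end up carrying most of the signal.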

Final notes:

It is important for our society to keep advancing machine learning, but with special attention to biases, which are sometimes added to these programs unconsciously. Thankfully, Amazon’s AI research team was able to recognize such biases and act on them. Still, it is worth asking: what if these biases had gone unrecognized, and the biased ML decision engine had been put to work in the company’s day-to-day recruiting?

The impact, along with its consequences, would have been atrocious.

I am always open to feedback. Please share in the comments if you see something that may need to be revisited. Thank you for reading!

DISCLAIMER: The views expressed in this article are those of the author(s) and do not represent the views of Carnegie Mellon University, nor other companies (directly or indirectly) associated with the author(s). These writings are not intended to be final products, but rather a reflection of current thinking, along with being a catalyst for discussion and improvement.