The Struggle to Make Machine Learning Models Less Biased

Using data collection as a means to reduce discrimination, without sacrificing accuracy


November 21, 2018 by Roberto Iriondo

Image: Youth Laboratories | ylabs.ai

Artificial intelligence's dirty little secret has been exposed, and no, it is not the one about machines taking over the world; that one is still a myth. This one is more treacherous. Data scientists, research scientists, and others predicted long ago that it would become a problem. But it is only within the last three years, as companies' AI (often some form of machine learning) has become nearly universal in our lives, that the issue has become critical.

As machine learning is now used to decide everything from stock prices, advertising, and marketing to medical diagnoses, it has never been more important to examine the decision-making process of these machine learning models and algorithms. Unfortunately, a good portion of currently deployed machine learning systems are prejudiced. Sexism, ageism, racism: you name it.

We are often quick to say that the way to make these machine learning models less biased is simply to come up with better algorithms. However, algorithms are only as good as the data fed to them, and training on unprejudiced data can make a big difference in the results.

The issue is well documented. For instance, in 2015 Google's photo application embarrassingly labeled some African-American individuals as gorillas [1]. A recent preprint reported widespread human bias in the metadata of a popular database of Flickr pictures used to train neural networks. Even more disturbing was an investigative report last year by ProPublica [2], which found that software used to predict future criminal behavior, à la the film "Minority Report," was biased against minorities.

Brisha Borden (portrayed right) was rated high risk for future crime after she and a friend took a kid's bike and scooter that were sitting outside. She has committed no further offenses since. | Source: ProPublica [3]

Anastasia Georgievskaya, a research scientist at Youth Laboratories [5], first encountered prejudice from machine learning while working on an AI-judged beauty contest application [4] that uses computer vision and machine learning to study aging: almost all of the winners picked by the ML models were white.

Many researchers thought that discrimination by AI was likely, but only in a very distant future. Thankfully, a lot of work is now being done to come up with better solutions so that future ML models do not discriminate against individuals. In the end, algorithms can always be improved; however, ML systems can only learn from the data they are fed.

Irene Chen, a PhD student at MIT CSAIL, brings a new perspective to making machine learning models less biased. In her paper [6], she discusses how recent attempts to achieve fairness in machine learning have focused on the trade-off between fairness and accuracy. In sensitive applications involving healthcare or criminal justice, however, that trade-off is often undesirable, because even a slight increase in prediction error can have devastating consequences.

Gregory Lugo (portrayed left) crashed his Lincoln Navigator into a Toyota Camry while drunk. He was rated as a low risk of re-offending despite the fact that it was at least his third DUI. In the example above we can see a clear bias against women in Northpointe's software. | Source: ProPublica [3]

David Sontag, an assistant professor at MIT, notes that the approach described in their paper [6] can help machine learning engineers figure out what questions to ask of their data in order to diagnose why their ML models may be making unfair predictions. A sketch of that kind of diagnosis follows below.
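To make that concrete, here is a minimal sketch (not the authors' code) of the kind of question the paper suggests asking of your data: does the model's error rate differ across a sensitive attribute? The synthetic dataset, the "group" column, and the model choice below are all hypothetical, used only for illustration.

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic data: a binary "group" attribute covering only 10% of rows,
# whose label depends on the features differently from the majority.
n = 10000
group = rng.binomial(1, 0.1, size=n)
x1, x2 = rng.normal(size=n), rng.normal(size=n)
logits = np.where(group == 0, x1 + x2, -x1 + x2)
y = (logits + rng.normal(scale=0.3, size=n) > 0).astype(int)
df = pd.DataFrame({"x1": x1, "x2": x2, "group": group, "label": y})

train, test = train_test_split(df, test_size=0.3, random_state=0)
model = LogisticRegression().fit(train[["x1", "x2"]], train["label"])
test = test.assign(pred=model.predict(test[["x1", "x2"]]))

# The diagnostic step: break the error rate down by subgroup. A large gap
# signals that the model is failing a specific population, not merely
# underperforming overall.
for g, part in test.groupby("group"):
    err = (part["pred"] != part["label"]).mean()
    print(f"group={g}: error rate = {err:.3f} over {len(part)} test rows")

The per-subgroup breakdown is the point of the exercise: an overall accuracy number can look healthy while one population carries most of the mistakes.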

By far the largest misconception is that more data is always better. When it comes to people, however, that does not necessarily help: drawing from the exact same population often leaves the same subgroups under-represented. Even ImageNet [8], one of the most popular image databases, with its gigantic collection of images, has been shown to be biased toward the Northern Hemisphere [7].

Often the key is to collect more data from the subgroups being discriminated against. One of the case studies in Chen's paper [6] looked at an income-prediction system and found that it was twice as likely to misclassify female employees as low-income and male employees as high-income. The researchers found that if they increased the dataset by a factor of 10, those mistakes would happen 40 percent less often. The sketch below illustrates the idea on made-up data.
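The following rough illustration uses synthetic data, not the paper's income dataset: as the number of training examples from the under-represented subgroup grows, the model's error rate on that subgroup shrinks. The data-generating process and feature construction here are assumptions made for the sketch.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_data(n_majority, n_minority):
    """Two subgroups whose labels depend on the features differently."""
    g = np.concatenate([np.zeros(n_majority), np.ones(n_minority)])
    x = rng.normal(size=(len(g), 2))
    logits = x[:, 0] + x[:, 1] - 2.0 * g * x[:, 1]  # group 1 flips x2's effect
    y = (logits + rng.normal(scale=0.3, size=len(g)) > 0).astype(int)
    return x, y, g

def featurize(x, g):
    # Include the group indicator and its interactions, so the model *can*
    # represent the subgroup's relationship; what limits it is how much
    # subgroup data it has to estimate those extra coefficients from.
    return np.column_stack([x, g, g[:, None] * x])

def minority_error(n_minority_train):
    x_tr, y_tr, g_tr = make_data(20000, n_minority_train)
    x_te, y_te, g_te = make_data(5000, 5000)  # balanced test set
    model = LogisticRegression().fit(featurize(x_tr, g_tr), y_tr)
    pred = model.predict(featurize(x_te, g_te))
    return (pred[g_te == 1] != y_te[g_te == 1]).mean()

for n_min in (50, 500, 5000):
    print(f"minority training examples: {n_min:5d} -> "
          f"minority test error: {minority_error(n_min):.3f}")

Because the coefficients that describe the subgroup's behavior can only be estimated from that subgroup's own examples, piling on more majority data does little, while adding subgroup data steadily reduces that subgroup's error.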

Unfortunately, there is structural bias in the world, and people need to be aware of this.

It is important to continue building awareness around issues of bias in machine learning, and to invest heavily in scrubbing discrimination out of the datasets currently being used. Technologies and algorithms have become essential and surround us in every part of our lives. It is imperative to design machine learning systems that treat us fairly, so that we can live without prejudice in the future.

DISCLAIMER: The views expressed in this article are those of the author(s) and do not represent the views of Carnegie Mellon University, nor other companies (directly or indirectly) associated with the author(s). These writings are not intended to be final products, but rather a reflection of current thinking, as well as a catalyst for discussion and improvement.

You can find me on my personal website, Medium, Instagram, Twitter, Facebook, LinkedIn, or through my web design company.

References:

[1] Google ‘fixed’ its racist algorithm by removing gorillas from its image-labelling tech | The Verge | https://www.theverge.com/2018/1/12/16882408/google-racist-gorillas-photo-recognition-algorithm-ai

[2] Machine Bias: There's software used across the country to predict future criminals. And it's biased against blacks. | ProPublica | https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

[3] How We Analyzed the COMPAS Recidivism Algorithm | ProPublica | https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

[4] First International Beauty Contest, Judged by Artificial Intelligence | Youth Laboratories | http://beauty.ai

[5] Machine Vision and Artificial Intelligence for Beauty and Healthy Longevity | Youth Laboratories | http://ylabs.ai/

[6] Why Is My Classifier Discriminatory? | Irene Chen, Fredrik D. Johansson, David Sontag | Massachusetts Institute of Technology | https://arxiv.org/pdf/1805.12002.pdf

[7] A Deeper Look at Dataset Bias | Springer | https://www.springer.com/cda/content/document/cda_downloaddocument/9783319583464-c2.pdf

[8] ImageNet | http://www.image-net.org/
