Ethical AI Beyond Just Talking: Part One

Addressing model fairness in the artificial intelligence and machine learning space

Adriana Beal
Slalom Data & AI
8 min read · Mar 29, 2022


Look around and you will find machine learning (ML) and artificial intelligence (AI) tools being used to improve public health, optimize disaster response, reduce the risk of homelessness, and more. However, even when models are created by people with the best of intentions and using the best available data, AI and ML solutions can produce unfair biases that perpetuate harm.

For AI to fulfill the promise of delivering fair, accountable, and transparent decisions, it’s not enough to have good intentions. It’s critical to develop awareness about the dangers associated with its use — such as discriminatory treatment and privacy breaches — and the instruments that can be used to prevent or mitigate them.

In the first part of this two-part series, we’re addressing a core concern in the AI and ML space: model fairness.

When do we need to worry about model fairness?

In many data science projects, the notion of “algorithmic fairness” (or, more appropriately, “model fairness”) won’t be a concern because the model isn’t dealing with human outcomes. Consider an AI system used by an agriculture business to predict the optimal amount of fertilizer, or by a manufacturing equipment provider to improve its predictive maintenance schedule. In such cases, mistakes made by the model will not create a risk of discriminating against people by their skin color, gender, age, or another human characteristic.

Model fairness becomes critical whenever a model is used to predict something that will be consequential for individuals or groups. Here we will focus on three common scenarios:

1. Risk of unjust allocation of opportunities or resources

There is a risk of unjust allocation whenever a model contributes to allocating an opportunity or resource to certain groups, or withholding it from them. It’s easy to see the danger of unfair outcomes when AI/ML is used to support decisions such as approving loan applications, screening job candidates, or awarding scholarships.

2. Risk of uneven quality of service

The risk of uneven quality of service exists whenever a service may not work as well for one group of people as it does for another. The issue is perfectly illustrated by the film Coded Bias, a documentary that examines the impact of MIT researcher Joy Buolamwini’s discovery regarding the inequity in facial recognition software.

Other examples of harm in quality of service include:

  • A voice recognition system for ordering food that recognizes men’s voices well but fails to understand requests from most women and children.
  • A biometric system used to grant access to a residential building that systematically denies access to non-Caucasian residents.

3. Risk of perpetuating damaging stereotypes

Algorithmic solutions may perpetuate harmful labels associated with underprivileged groups or pigeonhole individuals based on race, gender, age, economic status, or other characteristics.

Examples of AI/ML solutions that help perpetuate damaging stereotypes and lead to disparities in outcomes include:

  • A sexist translation engine that translates “Men should clean the kitchen” to “Women should clean the kitchen.”
  • A search engine that automatically selects photos to illustrate online content and consistently picks men to represent doctors and women to represent nurses.

What causes models created by well-intentioned teams to exhibit unfair behavior?

Bias in training and test data is a common cause of mistakes made by ML models that translate into unfair treatment. Consider a face-recognition system built by a company claiming an accuracy rate above 97%. This high level of accuracy was achieved using samples that were more than 77% male and more than 83% white. When deployed in the real world, where the percentage of white males is much smaller, the model loses accuracy, with a disproportionate impact on people who don’t belong to the primary training sample group.

Another common mistake in trying to achieve model fairness is the adoption of simplistic solutions. This could include forbidding access to elements like racial or gender data when those elements are seemingly irrelevant to the task at hand. As addressed in The Ethical Algorithm: The Science of Socially Aware Algorithm Design, there are two serious problems with this approach:

  1. Removing explicit reference to features like race and gender does not guarantee that the resulting model won’t exhibit some form of racial or gender bias. Models can easily learn to deduce those characteristics from other inputs such as zip codes, car models, and computer or phone types (a quick check for this kind of proxy leakage is sketched after this list).
  2. Determining whether any information about individuals is “irrelevant” for a prediction or classification isn’t a simple task. Removing a model’s access to features like race and gender often will reduce model accuracy and might even make a model more biased against a racial or gender group.
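
To make the first point concrete, one way to probe for proxy leakage is to try to predict the sensitive attribute from the features that remain after it has been dropped; if they predict it far better than chance, the model can effectively reconstruct it. The sketch below is illustrative only: the file name, column names, and choice of classifier are assumptions, not from a specific project.

```python
# Hypothetical sketch: measure how well the "non-sensitive" features
# still encode a sensitive attribute (proxy leakage).
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("applicants.csv")                       # assumed data set, one row per person
sensitive = df["gender"]                                 # attribute removed from the model
proxies = df[["zip_code", "car_model", "device_type"]]   # features the model keeps

# If these features predict gender far better than chance, dropping the
# gender column did not remove the information from the model's reach.
leakage = cross_val_score(
    GradientBoostingClassifier(),
    pd.get_dummies(proxies),
    sensitive,
    cv=5,
    scoring="accuracy",
).mean()
print(f"Gender predictable from remaining features with accuracy ~{leakage:.2f}")
```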

What can be done to mitigate bias and unfair treatment in AI/ML models?

Because there are so many ways for algorithms to deliberately or inadvertently circumvent efforts to enforce fairness, fair treatment can’t be achieved by simply trying to restrict the inputs given to an ML model.

For the time being, effective solutions to mitigate bias in AI/ML remain firmly in the domain of human decision-making. The following steps provide a framework to help organizations design predictive models and AI/ML systems that are optimized for more than raw accuracy, taking the human and societal context of fair treatment into account.

1. Define “fairness”

There isn’t a universal definition of fairness that can be applied to most situations. The first step in pursuing “fair models” is to define what “fairness” means in the context of your specific business problem.

Predictive models are inevitably imperfect in that there will always be a certain rate of false rejections. In this context, a false rejection could be a model denying a loan to an applicant who would have repaid or rejecting a job seeker who would be the best candidate for the job.

In many allocation problems, a valid way to define fairness is equality of false rejections. While achieving complete equality of false rejections may be impossible due to inherent model and data flaws, the goal is to ensure that the discrepancy doesn’t disproportionately impact people that belong to a certain group based on race, gender, ethnicity, age, disability status, or another relevant characteristic.
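
As a concrete illustration of this definition, the sketch below computes the false rejection rate separately for two groups. The arrays are made-up toy data, not results from a real model.

```python
# Minimal sketch: compare false rejection (false negative) rates across groups.
import numpy as np

y_true = np.array([1, 1, 0, 1, 1, 0, 1, 1])   # 1 = would have repaid the loan
y_pred = np.array([1, 0, 0, 1, 0, 0, 0, 1])   # model's approve (1) / deny (0) decision
group  = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

def false_rejection_rate(y_true, y_pred, mask):
    """Share of truly qualified people in the group that the model rejected."""
    qualified = (y_true == 1) & mask
    return ((y_pred == 0) & qualified).sum() / qualified.sum()

for g in np.unique(group):
    rate = false_rejection_rate(y_true, y_pred, group == g)
    print(f"group {g}: false rejection rate = {rate:.2f}")

# A large gap between the per-group rates is the kind of disparity an
# "equality of false rejections" definition of fairness would flag.
```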

2. Pay close attention to the data used to train and test models

Many organizations end up with unfair models because they don’t put enough thought into their training and testing data.

The founder of a startup recently asked if I would be willing to validate the accuracy of a predictive model as an independent auditor. A quick conversation showed that their model — which was meant to serve the general population — had been trained and tested using a heavily skewed public data set.

The first step to mitigate the risks of performance degradation and unfairness in the deployed model would be to address the mismatch between the algorithm input and the real world by seeking training data that reflects underrepresented groups.
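
A lightweight first check is to compare group representation in the training data against the population the model will serve. The sketch below assumes a hypothetical training file with a gender column and made-up population shares; in practice the reference shares would come from census or domain data.

```python
# Sketch: compare group representation in the training set against the
# population the deployed model is meant to serve. Numbers are illustrative.
import pandas as pd

train = pd.read_csv("training_data.csv")                 # assumed file with a "gender" column
train_share = train["gender"].value_counts(normalize=True)

# Reference shares for the target population (e.g., from census figures).
population_share = pd.Series({"female": 0.51, "male": 0.49})

report = pd.DataFrame({
    "training": train_share,
    "population": population_share,
    "gap": train_share - population_share,
})
print(report)
# Large negative gaps flag groups that are underrepresented in the training
# data and therefore at higher risk of degraded accuracy after deployment.
```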

3. Introduce fairness considerations as part of the learning process

The primary goal of ML is to maximize predictive accuracy. However, as seen above, this is sometimes achieved at the expense of minorities and other underrepresented groups.

One approach that can prevent models from being unfair to an underrepresented group is to introduce constraints into the training process so the model stops optimizing solely for the majority group. That said, fairness constraints come at a cost in accuracy. By definition, mitigating the bias in a race-, gender-, or age-biased model will result in a less accurate model.

In other words, a fairer model (e.g., one that offers similar rates of false rejection across different groups and subgroups) may result in the model making more mistakes overall, such as rejecting more loans of creditworthy applicants or increasing the number of innocent people who are incarcerated.

Fortunately, the trade-offs between accuracy and fair treatment can be quantified to support informed decisions that fight discrimination while still preserving as much accuracy as possible. In some cases, the best solution may be to create two or more models. An example of this could be a model that predicts a medical diagnosis for men and a separate model for women, rather than a single model that sacrifices the accuracy of the predictions in order to achieve equality of false negatives between the groups.
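
As one possible illustration, the open-source Fairlearn library (discussed further in step four) implements a “reductions” approach that retrains an ordinary scikit-learn estimator under a fairness constraint. The sketch below is an assumption-laden example, not a prescribed recipe: it imagines a historical loan data set with a repaid outcome and a gender column, and uses a true-positive-rate parity constraint, which is equivalent to equalizing false rejection rates.

```python
# Sketch of constrained training with Fairlearn's reductions approach.
# Data set, column names, and estimator choice are illustrative.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, TruePositiveRateParity

df = pd.read_csv("loan_applications.csv")                # assumed historical data
X = pd.get_dummies(df.drop(columns=["repaid", "gender"]))
y = df["repaid"]
sensitive = df["gender"]

# Equalizing true positive rates across groups is the same as equalizing
# false rejection rates, the fairness definition used in step one.
mitigator = ExponentiatedGradient(
    LogisticRegression(max_iter=1000),
    constraints=TruePositiveRateParity(),
)
mitigator.fit(X, y, sensitive_features=sensitive)
fair_predictions = mitigator.predict(X)
```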

4. Assess the fairness of the final model

Once the final model is ready, it’s time to apply statistical analysis to determine whether it results in unfair treatment of a population, based on the adopted definition of fairness. Fairlearn is an example of an open-source tool that can be used for this purpose. The same library also provides algorithms to introduce the constraints and quantify the trade-offs mentioned in step three.
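
Here is a minimal sketch of such an assessment using Fairlearn’s MetricFrame, with made-up arrays standing in for real held-out test labels and predictions:

```python
# Sketch: break model metrics down by group with Fairlearn's MetricFrame.
import numpy as np
from fairlearn.metrics import MetricFrame, false_negative_rate, selection_rate
from sklearn.metrics import accuracy_score

# Illustrative arrays; in practice these come from your held-out test set.
y_true = np.array([1, 1, 0, 1, 1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1, 0, 0, 0, 1])
gender = np.array(["female", "female", "female", "female",
                   "male", "male", "male", "male"])

mf = MetricFrame(
    metrics={
        "accuracy": accuracy_score,
        "false_rejection_rate": false_negative_rate,
        "selection_rate": selection_rate,
    },
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=gender,
)
print(mf.by_group)       # each metric broken down per group
print(mf.difference())   # largest gap between groups, per metric
```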

5. If fairness issues are found, mitigate them

If any issues surface in step four, it may be necessary to go back to steps two and three in order to improve the quality of the training data or improve the constraints used to reduce the disparity of treatment among groups. If these improvements aren’t feasible or degrade the overall model accuracy due to the unavoidable performance-fairness trade-off, it may be necessary to accept the model limitations and adjust its use accordingly.

For example, imagine that an organization is trying to decide which applicants should receive a scholarship based on their likelihood to graduate within five years. A data science team builds a model to make predictions based on historical data, feeding an algorithm information such as age, education, and employment history, along with the outcome of each former student. The model “learns” how to predict collegiate success of candidates who went to traditional schools, but not of homeschooled candidates.

It may be impossible to mitigate the issue by adding constraints to the training process without substantially degrading the predictive accuracy for traditional school candidates. Rather than accepting lower accuracy to improve fairness, this model could be used to predict the success of applicants from a traditional school setting, while a separate decision process is created for homeschooled applicants.

What about legal ramifications?

Under many circumstances, adjusting or constraining ML models to counteract unintentional bias will be crucial for the thoughtful and responsible deployment of AI systems that impact humans. These actions may be essential for compliance with anti-discrimination laws as well.

However, it’s also possible to face liability when applying bias mitigation techniques like the ones described above. This is particularly true when decisions are based on legally protected characteristics such as race, religion, sex, national origin, age, or disability status.

A key tactic to ensure that model design decisions are legally permissible is to establish evidence that such decisions were made to mitigate the risk of unfair treatment of specific groups. For more information on this topic, see Is Algorithmic Affirmative Action Legal?

If algorithmic models are so prone to flaws and bias, why not stick to human judgment?

While it may never be possible to create ML models that are perfectly fair to all groups and individuals, the same issues arise when using approaches that don’t rely on ML algorithms. Because humans are biased in ways that machines are not — and because algorithmic models have the potential to dramatically improve the efficiency and fairness of many consequential decisions — the best way forward is to continue to research and apply state-of-the-art techniques to promote the ethical and responsible deployment of AI/ML systems.

This post is part one of a two-part series on Ethical AI. Here is a link to the second post on algorithmic privacy.

Slalom is a global consulting firm focused on strategy, technology, and business transformation. Learn more and reach out today.



Adriana works at Slalom designing machine learning models to improve operational and decision processes in IoT, mobility, healthcare, human services, and more.