3 Strategies to Fight Discrimination in AI Applications

What are the different approaches to mitigating bias?

Boris Ruf
Just-Tech-IT
3 min read · Feb 3, 2022


Photo by Pawel Czerwinski on Unsplash

Unwanted bias has been identified as a major risk to the wider adoption of artificial intelligence (AI). In my previous article, I discussed the challenge of defining the fairness objective for an AI application. In this one, I outline the different strategies available for actively mitigating bias in AI models.

Many bias mitigation strategies for machine learning (ML) have been proposed in the research literature in recent years. While the underlying concepts vary considerably, they all share the same goal: to ensure that the decisions taken by an AI system satisfy some definition of fairness. The different approaches can be divided into the following three distinct groups.

1. Pre-processing

Effective bias mitigation starts at the data acquisition and processing phase, since both the source of the data and the extraction methods can introduce unwanted bias. Therefore, great effort must be put into validating the integrity of the data source and into ensuring that the data collection process includes appropriate and reliable methods of measurement. Prior to the era of “big data”, most data were collected through questionnaires, which allowed experimental designs to control for possible biases through statistical analysis. Today, technology provides us with large amounts of data at low cost; however, information about the conditions under which the data were collected is often scarce.

Hence, algorithms which belong to the pre-processing family ensure that the input data is balanced and fair. This can be achieved by suppressing the protected attributes, by learning fair representations, by changing class labels, or by reweighing or resampling the data. In some cases, it is also necessary to reconstruct omitted or censored data in order to ensure the data sample is representative. Plenty of imputation methods exist for this purpose, and hot deck procedures are among the most effective.
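
To make the reweighing idea concrete, here is a minimal sketch in Python: each training example receives a weight such that, after weighting, the protected attribute and the class label are statistically independent. The column names and toy data are hypothetical, chosen purely for illustration.

```python
import pandas as pd

def reweighing_weights(df, protected="gender", label="approved"):
    """Assign each example the weight P(A=a) * P(Y=y) / P(A=a, Y=y),
    so that protected attribute and label are independent after weighting."""
    weights = pd.Series(1.0, index=df.index)
    for a in df[protected].unique():
        for y in df[label].unique():
            p_a = (df[protected] == a).mean()       # P(A=a)
            p_y = (df[label] == y).mean()           # P(Y=y)
            mask = (df[protected] == a) & (df[label] == y)
            p_ay = mask.mean()                      # P(A=a, Y=y)
            if p_ay > 0:
                # ratio of expected joint probability under independence
                # to the observed joint probability
                weights[mask] = (p_a * p_y) / p_ay
    return weights

# Hypothetical toy data
df = pd.DataFrame({
    "gender":   ["f", "f", "m", "m", "m", "f"],
    "approved": [0, 1, 1, 1, 0, 0],
})
df["weight"] = reweighing_weights(df)
print(df)
```

Underrepresented combinations, such as approved applications from the disadvantaged group, receive weights above one, so a downstream learner that supports sample weights pays them proportionally more attention.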

2. In-processing

The second type of mitigation strategy comprises the in-processing algorithms. Here, undesired bias is mitigated directly during the training phase. A straightforward way to achieve this is to integrate a fairness penalty directly into the loss function. One such algorithm adds a decision boundary covariance constraint to logistic regression or linear SVM. In another approach, a meta-algorithm takes a fairness metric as part of its input and returns a new classifier optimised towards that metric. Furthermore, the emergence of generative adversarial networks (GANs) provided the underpinning for fair classification through adversarial debiasing: a neural network classifier is trained as a classical predictor while, simultaneously, the ability of an adversarial neural network to predict the protected attribute is minimised.
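
As a simplified illustration of a fairness penalty in the loss function, the sketch below adds a demographic-parity term to a plain logistic regression loss. This is not the covariance constraint from the cited work, and all variable names and data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def penalised_loss(w, X, y, a, lam=1.0):
    """Logistic loss plus a fairness penalty: the squared gap between
    the mean predicted score of the two protected groups (a=0 vs a=1)."""
    p = 1.0 / (1.0 + np.exp(-X @ w))           # predicted probabilities
    eps = 1e-9
    log_loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    gap = p[a == 1].mean() - p[a == 0].mean()  # group score gap
    return log_loss + lam * gap ** 2

# Hypothetical toy data: 200 examples, 3 features, binary protected attribute
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
a = rng.integers(0, 2, size=200)
y = (X[:, 0] + 0.5 * a + rng.normal(scale=0.5, size=200) > 0).astype(float)

result = minimize(penalised_loss, np.zeros(3), args=(X, y, a, 1.0))
print("learned weights:", result.x)
```

The hyperparameter lam controls the trade-off: a value of zero recovers ordinary logistic regression, while larger values push the two groups' average scores closer together at some cost in accuracy.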

3. Post-processing

The final group of mitigation algorithms follows a post-processing approach. In this case, only the output of a trained classifier is modified. A Bayes optimal equalised odds predictor can be used to change output labels with respect to an equalised odds objective. A different paper presents a weighted estimator for demographic disparity which uses soft classification based on proxy model outputs. The advantage of post-processing algorithms is that fair classifiers are derived without retraining the original model, which can be time-consuming or difficult to implement in production environments. However, this approach may hurt accuracy or compromise any generalisation acquired by the original classifier.
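
As a simplified illustration of the post-processing idea, the sketch below picks a separate decision threshold per protected group so that both groups end up with roughly the same positive rate. This is a demographic-parity variant rather than the full equalised odds derivation, and the scores and names are hypothetical.

```python
import numpy as np

def group_thresholds(scores, a, target_rate=0.5):
    """For each protected group, choose the score threshold whose
    resulting positive rate is closest to the target rate."""
    thresholds = {}
    for g in np.unique(a):
        s = scores[a == g]
        candidates = np.unique(s)
        rates = [(s >= t).mean() for t in candidates]
        best = np.argmin([abs(r - target_rate) for r in rates])
        thresholds[g] = candidates[best]
    return thresholds

# Hypothetical scores from an already trained classifier
rng = np.random.default_rng(1)
a = rng.integers(0, 2, size=1000)
scores = rng.beta(2 + a, 2, size=1000)   # group 1 skews higher

th = group_thresholds(scores, a, target_rate=0.3)
decisions = scores >= np.vectorize(th.get)(a)
for g in (0, 1):
    print(f"group {g}: threshold={th[g]:.2f}, "
          f"positive rate={decisions[a == g].mean():.2f}")
```

Note that the original model is left untouched; only the decision rule applied to its scores changes, which is precisely why this family of methods is attractive in production environments.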

So what?

Many technical strategies have been proposed by the research community to fight unwanted bias in AI. They can be divided into three major groups: algorithms of the “pre-processing” group mitigate bias that exists in the training data; “in-processing” approaches tackle bias during the learning phase; and, finally, “post-processing” algorithms adjust the output labels of trained classifiers.

Many thanks to Antoine Pietri for his valuable support in writing this post.
