Understanding Algorithmic Bias

Condensing the ideas expressed in the Algorithmic Bias in Autonomous Systems paper.

Nandhini Swaminathan
The Research Nest
5 min read · Jan 25, 2022


Credits: Harriet Lee-Merrion

Not long ago, I came across the paper “Algorithmic Bias in Autonomous Systems” by Dr. David Danks and Dr. Alex John London and realized that when we talk about bias, we often refer only to an imbalance in a model's training data.
But in reality, there are multiple categories of bias. They arise at every stage, from data sampling to algorithm design and processing, through to application and interpretation by end consumers (human or autonomous systems). At each step, algorithmic bias raises a different set of concerns and possibilities.

While not all of them need to be corrected, those that do are usually not easy to resolve. This is because the answers involve determining which factors should and should not influence the outcome, and by how much. Having clarity about the concept of bias therefore makes it easier to arrive at a solution.

The paper helped me discern between the different types of algorithmic bias and understand how they affect the functioning of autonomous systems. This clarity helped me realize that the required responses to mitigate them vary from problem to problem. There is no universal solution to this.

Types of Bias

Potential biases that might be introduced in an algorithmic value chain (Adapted from Silva & Kenney)

Training Data Bias

When a biased dataset is used, the resulting model reflects that bias (garbage in, garbage out). Such data does not represent the model's intended use case and thus produces skewed outcomes and low accuracy.

Consider a computer vision model intended to detect vehicles throughout India. If the data used to train it reflects traffic conditions in the U.S., it will perform poorly and fail to detect vehicles like tuk-tuks and rickshaws.

In this scenario, the algorithm's output is accurate if we judge it against the input data. However, we observe bias because it does not perform well in the test scenario due to the flawed input data. This is a clear example of how a model's “performance” varies according to the standard it is held accountable to.
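To make this concrete, here is a minimal Python sketch (my own illustration, not from the paper; the two-feature setup and all numbers are made up) of how a training sample that under-represents a class degrades performance in the deployment context:

```python
# Hypothetical sketch of training data bias: a classifier trained on data where
# one vehicle class is nearly absent (stand-in for "U.S. traffic") is evaluated
# on data where that class is common (stand-in for "Indian traffic").
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_data(n, unfamiliar_share):
    # unfamiliar_share: fraction of examples from the rarely-seen vehicle class
    y = rng.binomial(1, unfamiliar_share, size=n)
    X = rng.normal(loc=y[:, None] * 2.0, scale=1.0, size=(n, 2))
    return X, y

X_train, y_train = make_data(5000, unfamiliar_share=0.02)  # biased sample
X_test, y_test = make_data(2000, unfamiliar_share=0.40)    # deployment context

model = LogisticRegression().fit(X_train, y_train)
print("accuracy on data like the training set:",
      round(accuracy_score(y_train, model.predict(X_train)), 3))
print("accuracy in the deployment context:    ",
      round(accuracy_score(y_test, model.predict(X_test)), 3))
```

Judged against its own training distribution the model looks fine; judged against the deployment context, the bias shows up as a clear drop in accuracy.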

Algorithmic Focus Bias

This type of bias arises from the inclusion or omission of specific input features present in the data, whether for legal, moral, statistical, or other reasons.

Imagine you have a dataset of customer sales in the U.S. and England. 97% of your customers are in the U.S., so it might seem wise to drop the country field as statistically insignificant. This is algorithmic focus bias.

However, your model will never learn that your British customers spend twice as much as their American counterparts, so this omission would not be a good business decision. This process of filtering features therefore becomes complicated once we start considering different standards.
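A quick, hypothetical pandas sketch of that scenario (the 97/3 split and the spend figures are invented for illustration) shows how dropping the country column erases the signal:

```python
# Hypothetical sketch of algorithmic focus bias: the "country" feature looks
# statistically insignificant, but dropping it hides a real difference in spend.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000
country = np.where(rng.random(n) < 0.97, "US", "UK")   # 97% U.S., 3% U.K.
spend = np.where(country == "US",
                 rng.normal(50, 10, n),                 # illustrative U.S. order ~ $50
                 rng.normal(100, 20, n))                # illustrative U.K. order ~ $100
df = pd.DataFrame({"country": country, "spend": spend})

# Dropping the feature ("it's only 3% of rows") loses the signal entirely
print("overall mean spend:", round(df["spend"].mean(), 2))
# Keeping it reveals the per-country difference
print(df.groupby("country")["spend"].mean().round(2))
```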

Algorithmic Processing Bias

Algorithmic processing bias occurs when the algorithm itself is biased. It often arises through deliberate modifications to the algorithm, such as changing the weighting of variables or the dynamics between them, to counteract other biases already present, thereby improving the algorithm's performance.

Thus, such a model is considered biased, but as we noted earlier, certain biases like this need not be corrected.
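As a rough sketch of such a deliberate adjustment (my own example, with made-up numbers, not taken from the paper), one common technique is to re-weight classes during training so the algorithm compensates for a skewed sample:

```python
# Hypothetical sketch of algorithmic processing bias used deliberately: the
# learning procedure is re-weighted (scikit-learn's class_weight="balanced")
# to counteract a skewed training sample. The model is "biased" by design,
# but performs better on the intended, more balanced population.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(2)
n = 5000
y = rng.binomial(1, 0.05, size=n)              # rare positive class in training
X = rng.normal(loc=y[:, None] * 1.5, size=(n, 2))

plain = LogisticRegression().fit(X, y)
reweighted = LogisticRegression(class_weight="balanced").fit(X, y)

# Evaluate on a balanced sample resembling the intended deployment context
y_test = rng.binomial(1, 0.5, size=2000)
X_test = rng.normal(loc=y_test[:, None] * 1.5, size=(2000, 2))
print("plain:     ", round(balanced_accuracy_score(y_test, plain.predict(X_test)), 3))
print("reweighted:", round(balanced_accuracy_score(y_test, reweighted.predict(X_test)), 3))
```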

Transfer Context Bias

When a model is deployed outside of its intended context, it experiences transfer context bias: it does not perform “well” according to the relevant statistical, moral, or legal standards. Consider an algorithm employed to predict a particular outcome in a given population. If it delivers inaccurate results when applied to a different demographic, that is a form of transfer context bias. The biased performance is caused solely by employing the model in the wrong context.
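The sketch below is a hypothetical illustration of this (again my own, not from the paper): a regression model fit on one population is scored on another population where the underlying relationship differs, and its performance degrades even though the model itself is unchanged:

```python
# Hypothetical sketch of transfer context bias: the model is fine in the
# population it was built for, but the feature/outcome relationship differs
# in the new population, so predictions degrade.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

def population(n, slope):
    x = rng.uniform(0, 10, size=(n, 1))
    y = slope * x[:, 0] + rng.normal(0, 1, size=n)
    return x, y

X_a, y_a = population(2000, slope=2.0)    # population the model was built for
X_b, y_b = population(2000, slope=0.5)    # different demographic, different dynamics

model = LinearRegression().fit(X_a, y_a)
print("R^2 in the intended context:", round(model.score(X_a, y_a), 3))
print("R^2 in the transfer context:", round(model.score(X_b, y_b), 3))
```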

Interpretation Bias

Interpretation bias occurs when a user or an autonomous system misinterprets the algorithm’s output.

Take, for example, a manufacturing execution system at a factory that depends on multiple independent algorithms to optimize the manufacturing process. If it misinterprets the output of even one of those algorithms, the system's efficacy degrades; it has experienced interpretation bias.
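Here is a small, contrived sketch of that failure mode (the threshold, the reading, and the probability model are all invented for illustration): an upstream algorithm reports a defect probability, but the consuming system reads the value as a percentage and never intervenes:

```python
# Hypothetical sketch of interpretation bias: an upstream algorithm reports a
# defect *probability* in [0, 1], but the consuming system interprets the value
# as a defect *percentage*, so it almost never triggers an intervention.
def defect_probability(sensor_reading: float) -> float:
    # stand-in model: probability that a unit is defective
    return min(max(sensor_reading / 10.0, 0.0), 1.0)

THRESHOLD_PERCENT = 5.0   # "intervene if more than 5% defects"

reading = 8.0
p = defect_probability(reading)          # 0.8, i.e. an 80% chance of a defect
misread_as_percent = p                   # wrongly treated as "0.8% defects"

print("correct interpretation -> intervene:", p * 100 > THRESHOLD_PERCENT)
print("misinterpretation      -> intervene:", misread_as_percent > THRESHOLD_PERCENT)
```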

Responding to Algorithmic Bias

The first step involves determining whether the bias present is problematic. As we have seen, there are cases where one form of bias is utilized to offset the effects of another type. There might also be scenarios where the bias holds very little significance. In these cases, eliminating the bias is unnecessary and might not be advantageous.

If an algorithmic bias is identified as problematic, we must evaluate it through a thorough understanding of the algorithm's role and the contexts in which it is deployed. This helps us understand the nature and source of the bias, and appreciate the nuances of the values and norms to which the autonomous system's performance is accountable.

It is essential to comprehend the ethical and legal norms relevant to the context in which the algorithm is deployed. This ensures that the autonomous system does not experience transfer context or interpretation bias.

Once we have all the required knowledge, we can tackle the bias from multiple angles. Indeed, there will be cases where no solution, whether technological, social, or psychological, fully rectifies the problem. In these cases, we are forced to confront what we value, as we cannot have an autonomous system that is unbiased with regard to every standard.

We must then choose between algorithms with different biases and decide which is less detrimental. In many cases, the choice will be between an algorithm that is unbiased relative to performance standards and one that is unbiased with respect to moral and legal norms. These decisions demand that we look beyond a narrow set of technologies or even users, as they require judgments about relative values.
