Can AI be biased?

Josh Anderson
7 min read · Feb 9, 2023


A survey of biases that have been shown to be present in many machine learning models.

Photo by Piret Ilver on Unsplash

Over the past few years, there has been a lot of talk about bias in artificial intelligence (AI) systems. People have questioned whether AI is biased against certain groups of people, whether in the medical, educational, or financial industries. AI has entered nearly every corner of society and now carries major social impact. If the answer to the question of AI bias is yes, what are the implications? A bias in an algorithm can have a huge impact on the decisions it makes and the actions it takes. It can affect the jobs people get, the credit they are extended, the loans they are approved for, the schools their children are admitted to, or even the medical diagnosis a patient receives.

But how does bias creep into AI algorithms? In this article, I cover two of the most common forms of bias present in AI systems and bring awareness to the harm they can cause.

What introduces bias to AI?

AI systems, specifically machine learning models, can become biased after learning from real-world data. Bias can be introduced by the way the data is selected or through neglected correlations within it. Bias can also occur when the system’s learning design is not fair, causing the system to make discriminatory decisions simply because of its architecture. When biases like these make their way into a machine learning model, the result is a system that makes discriminatory or unfair decisions.

Other forms of bias that I do not discuss here have also been shown to negatively affect decision-making processes, including measurement error, misclassification of data, misdiagnosis, recall bias, missing data, and cognitive bias. To avoid these types of bias, it is important to consider the potential impact of human bias when designing AI and to verify that the system operates as intended. Let’s go into detail about the biases I mentioned.

Confounding bias

Confounding bias occurs when an AI system fails to account for a confounding variable in its training data, leaving the system with a distorted picture of how the features it learns from affect the outcome it is trying to predict.

This can be hard to picture in the abstract, so let’s look at a classic example known as the healthy worker bias.

Assume we want to build an AI model that predicts the probability that a person can safely escape a dangerous fire in a building. There are two factors we might consider in their chances of survival: whether they are physically fit and whether they have any firefighting training. Both of these attributes, while not the only factors, give someone an obvious advantage in safely exiting the building. You may have already thought that someone who is a trained firefighter is probably also physically fit, and you would be right. Unfortunately, humans are much better at spotting these implicit associations than computers. AI does not find hidden associations unless its designer accounts for them, so an engineer who overlooks a relationship like this can let bias into the model. We can visualize the issue with the following diagram:

Directed Acyclic Graph (DAG) of Healthy Worker Bias

In the diagram above, each arrow indicates cause and effect. Since being a trained firefighter is a cause of being physically fit and a cause of an increased probability of a safe exit, it is a confounding variable. An easy way to identify this type of bias is to draw a causal diagram like this one and check whether there is more than one route from a given variable to the outcome. In this example, the “trained firefighter” variable reaches the outcome directly as well as through the “physically fit” variable; therefore, there is confounding bias.
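This route-counting heuristic is easy to mechanize. Here is a minimal sketch using networkx (my choice of library; the graph below is just this example encoded by hand, and the variable names are my own):

```python
import networkx as nx

# The healthy worker DAG from the diagram above.
dag = nx.DiGraph()
dag.add_edges_from([
    ("trained_firefighter", "physically_fit"),  # training implies fitness
    ("trained_firefighter", "safe_exit"),       # training directly aids escape
    ("physically_fit", "safe_exit"),            # fitness directly aids escape
])

# Count the distinct routes from each variable to the outcome.
for var in ("trained_firefighter", "physically_fit"):
    paths = list(nx.all_simple_paths(dag, var, "safe_exit"))
    print(f"{var}: {len(paths)} path(s)")
    for p in paths:
        print("   ", " -> ".join(p))

# "trained_firefighter" has two routes to the outcome (direct, and
# through "physically_fit"), flagging it as a confounding variable.
```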

This bias would result in the model inaccurately assessing the impact these factors have on a person’s survival. For example, a journalist might publish a story reporting that the model found being physically fit increases the probability of survival by 20% when in reality the effect is 10%.
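To see how an ignored confounder skews an estimate like this, here is a toy simulation (all of the probabilities below are invented for illustration; by construction, fitness truly adds 10 percentage points to survival):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Invented ground truth: firefighters are rare but very likely to be fit,
# and training helps survival on its own.
trained = rng.random(n) < 0.1
fit = rng.random(n) < np.where(trained, 0.9, 0.3)
survived = rng.random(n) < 0.5 + 0.1 * fit + 0.2 * trained  # fitness adds 10 points

# Naive estimate: ignore training, compare fit vs. unfit directly.
naive = survived[fit].mean() - survived[~fit].mean()

# Adjusted estimate: compare fit vs. unfit within each training group.
adjusted = np.mean([
    survived[fit & (trained == t)].mean() - survived[~fit & (trained == t)].mean()
    for t in (True, False)
])

print(f"naive: {naive:.3f}, adjusted: {adjusted:.3f}")
# The naive estimate lands around 0.15 because fit people are
# disproportionately trained firefighters; the adjusted estimate
# recovers the true 0.10.
```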

Non-randomized studies (such as observational studies) are vulnerable to confounding bias if they do not take into account all potential confounding variables that may be at play. In general, confounding variables that are ignored can lead to a misunderstanding of the effect each variable has on the outcome.

Ultimately, bias can limit the validity and generalizability of a model’s results and should be mitigated wherever possible. Understanding and addressing bias is what allows us to make accurate scientific claims and reach conclusions that generalize to other situations. Let’s look at another common type of bias.

Selection bias

Machine learning and AI can be biased not only by ignored associations between variables but also by the selection of the data used to build the algorithm. This is called selection bias. Models can become biased after learning from real-world data if the input data is not representative of the target population. This does not mean the data has to represent every person or thing in the world, but the “population” you wish to make generalizations about does need to be represented by the information in your data.

For example, in pharmaceutical studies, selection bias may occur when the individuals selected as controls are less healthy than those selected to take the drug. Let’s say we had some arbitrary way to measure someone’s health on a scale of -5 to 5, where -5 is poor health and 5 is outstanding health. We can then visualize the difference a drug makes to a population’s health. If a pharmaceutical company developed a drug that did nothing to improve your health, there might be a temptation to use selection bias to make the results appear positive. Say they present the following chart:

It would appear that those prescribed the drug have a higher health rating than those who were not, and just looking at this chart might encourage someone to buy the drug. But if someone were to show you a chart of what the same people looked like before they took the drug, it might not tell the same story:

While there was some random change, it is obvious the drug is not the cause of the prescribed group’s better health, because they were healthier before they took the drug. This is why governing bodies like the FDA require pharmaceutical companies to exclude biases like selection bias through randomized clinical trials, ensuring studies reflect the true effect a drug has on the population.

If you were wondering, a randomized clinical trial on these same people with the same useless drug would look something like this:
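To make the contrast concrete, here is a toy sketch of the scenario above (the -5 to 5 health scale matches the article; the distribution, sample size, and noise level are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Baseline health scores for 1,000 people on the -5..5 scale.
health = np.clip(rng.normal(0, 2, 1000), -5, 5)

# The drug is useless: follow-up health is baseline plus random noise.
followup = np.clip(health + rng.normal(0, 0.5, health.size), -5, 5)

# Biased selection: give the drug to the healthiest half,
# use the less healthy half as controls.
order = np.argsort(health)
controls, treated = order[:500], order[500:]
print("biased gap:    ", followup[treated].mean() - followup[controls].mean())

# Randomized assignment: shuffle everyone, then split in half.
shuffled = rng.permutation(health.size)
controls, treated = shuffled[:500], shuffled[500:]
print("randomized gap:", followup[treated].mean() - followup[controls].mean())

# The biased split shows a large "effect" for a drug that does nothing;
# the randomized split shows a gap near zero.
```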

What are the impacts of AI bias?

Bias in AI systems can cause harm to individuals and businesses alike. A common harm is discrimination, where AI systems make predictions based on a person’s skin color, gender, religion, or other personal attributes. Bias also limits the validity and generalizability of results, making it difficult for researchers to draw valid conclusions from their work. Despite this, such generalizations are often made anyway, and a lack of awareness of these issues leads individuals, researchers, and businesses to trust biased AI. Without properly addressing and mitigating these forms of bias, there is a risk that AI systems could perpetuate societal problems and lead to unfair outcomes that would not exist otherwise. There are numerous stories in the news about the results of these heavily impactful models. Here is a famous example from a D.C. school district:

Whether a given model is biased, unfair, or inaccurate may depend on the goal at hand. Even an unbiased model can have a net negative impact if the goal it is given differs from the intended outcome. AI is only as good as the data and engineering that go into it.


As with any new technology, there are growing pains in mastering its use, but there is still much to be optimistic about in the future of AI. It can perform human functions more effectively, more efficiently, and at lower cost: assisting healthcare organizations in diagnosing and classifying patients, maintaining and tracking medical records, and processing health insurance claims. AI can also provide insight into customer behavior in order to optimize the customer experience.

With proper safeguards, AI systems can be effective tools for making decisions without being biased. Organizations must carefully consider the vision for their AI initiatives and outline clear objectives that incorporate considerations such as fairness, safety, privacy, dignity, and accessibility.

AI systems can be biased in ways that harm our society, but the proper adjustments when designing algorithms can keep them impartial. If you are curious about other ways AI has impacted society, read my article discussing ChatGPT and its future with Microsoft:

Joshua Anderson is currently working towards a Ph.D. in Intelligent Systems at the University of Pittsburgh researching fairness in medical AI models.

Liked what you read?

Click here to see my other articles: https://medium.com/@talkai

Disclosure: some of this article was written with the help of AI-assistive technology
