What is algorithmic bias?

To understand algorithmic bias, we must understand:

  1. Algorithm: Set of rules to be followed in calculations or other problem-solving operations, especially by a computer.
  2. Machine Learning: Development of computer programs that can access data and use it to learn for themselves.

So, when a data scientist writes a machine learning algorithm, they are basically asking the computer to rely on past data, recognize patterns and predict the outcome using probability.
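As a rough sketch of that idea (the data and numbers here are invented, and scikit-learn is just one common way to do it), the model looks at past examples and then answers new questions with a probability:

```python
# A minimal sketch: the model only knows what the past data tells it.
# The dataset below is made up purely for illustration.
from sklearn.linear_model import LogisticRegression

# Past data: hours studied -> passed the exam (1) or not (0)
hours_studied = [[1], [2], [3], [4], [5], [6], [7], [8]]
passed_exam   = [0,  0,  0,  1,  0,  1,  1,  1]

model = LogisticRegression()
model.fit(hours_studied, passed_exam)   # "rely on past data, recognize patterns"

# "predict the outcome using probability": learned probabilities of [fail, pass]
print(model.predict_proba([[4.5]]))
```

Whatever patterns sit in those past examples, helpful or not, are exactly what the model will reproduce.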

Algorithms are created by people and trained on user data. The coders bring certain intrinsic biases of their own, and the data often carries more. That bias is passed on to the machine learning algorithm, and it may even be exaggerated by AI systems.

This blind spot in computer algorithms, which leads them to produce discriminatory outcomes, is called algorithmic bias.

There are five common types of algorithmic bias:

1. Data that reflects existing biases

Do you remember the Friends episode “The One With the Male Nanny”? The episode revolves around everyone asking each other the same question — “What do you call a male nanny?”

If you search Google Images for “nurse”, you are far more likely to see pictures of women; search for “police” and you will mostly see men.
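A toy sketch of how that happens, using invented counts rather than real search data: if one gender dominates an occupation in the training data, the most probable answer simply echoes that imbalance.

```python
from collections import Counter

# Hypothetical, invented training data: (occupation, gender) pairs
training_data = (
    [("nurse", "female")] * 90 + [("nurse", "male")] * 10 +
    [("police officer", "male")] * 85 + [("police officer", "female")] * 15
)

def most_likely_gender(occupation):
    # The "model" just returns whatever was most frequent in the past data
    counts = Counter(g for occ, g in training_data if occ == occupation)
    return counts.most_common(1)[0][0]

print(most_likely_gender("nurse"))           # female -- the bias in the data becomes the prediction
print(most_likely_gender("police officer"))  # male
```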

2. Unbalanced classes in training data

How fairly do you think an algorithm would rate the chances of a non-male candidate winning a US Presidential election if its only training data were the 45 US Presidents to date, every one of them a man?
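Here is that situation reduced to a few lines (a deliberately naive frequency model, not any real election forecaster):

```python
# All 45 training examples belong to one class, so the learned probability
# of the other class is zero -- the model cannot imagine what it has never seen.
past_presidents = ["male"] * 45   # the only "training data" available

def probability_of(label, history):
    return history.count(label) / len(history)

print(probability_of("male", past_presidents))      # 1.0
print(probability_of("non-male", past_presidents))  # 0.0
```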

3. Data that doesn’t capture the right value

To grade student essays, AI was trained to read and understand the intricate complexities of prose. It was even asked to write fiction of its own. Here’s a bit of Harry Potter fan-fiction:

https://twitter.com/botnikstudios/status/940627812259696643
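One way to picture the “wrong value” problem, using a hypothetical grader and invented numbers: if the training grades happen to track essay length, the model learns to reward length instead of the quality of the writing.

```python
# Hypothetical essay grader: the only feature it ever sees is word count,
# so "long" becomes a stand-in for "good" -- the data never captured real quality.
import numpy as np
from sklearn.linear_model import LinearRegression

word_counts = np.array([[120], [250], [400], [600], [800]])   # invented training essays
grades      = np.array([55, 65, 75, 85, 95])                  # grades that happen to track length

grader = LinearRegression().fit(word_counts, grades)

# A rambling 900-word essay outscores a sharp 300-word one, regardless of content.
print(grader.predict([[900]]))  # high grade
print(grader.predict([[300]]))  # much lower grade
```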

4. Data that is amplified by feedback loops

So you watched a movie on Netflix because you liked it. Next, Netflix recommends similar movies. Now, did you click on something because you were interested in it, or because it was recommended? The algorithm cannot tell the difference, and that creates a feedback loop: each click reinforces the recommendation, which drives more clicks, until you are shown increasingly narrow and often divisive content.
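A crude simulation of that loop, with invented numbers: a recommender that always pushes whatever has the most clicks turns a tiny initial difference into a runaway lead.

```python
# Invented starting click counts: a tiny initial difference...
clicks = {"divisive": 6, "balanced": 5}

for day in range(30):
    # The recommender always pushes whatever has the most clicks so far...
    recommended = max(clicks, key=clicks.get)
    # ...and being recommended earns more clicks, which earns more recommendations.
    clicks[recommended] += 3

print(clicks)  # {'divisive': 96, 'balanced': 5} -- the small gap has become a chasm
```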

5. Malicious data

“Tay” was a chatbot released by Microsoft in 2016 as an experiment that let Twitter users chat with it. The chatbot was fed all kinds of racist remarks, and within 24 hours it started making them too. Tay learned from whatever conversations it was given, and the racism came along with them.
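This is not how Tay actually worked, but a toy bot that learns from unfiltered user input shows why this kind of poisoning is so easy:

```python
import random

class ParrotBot:
    """A toy bot that 'learns' by storing every message users send it."""
    def __init__(self):
        self.learned_replies = ["Hello!"]

    def chat(self, user_message):
        self.learned_replies.append(user_message)   # no filtering at all
        return random.choice(self.learned_replies)  # replies come straight from past users

bot = ParrotBot()
for message in ["nice to meet you", "<something hateful>", "<more of the same>"]:
    bot.chat(message)

print(bot.learned_replies)  # the malicious input is now part of the bot's "knowledge"
```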

A good algorithm is one that finds the most efficient path from its existing knowledge to an outcome. There will always be outliers, so an algorithm will always be somewhat biased. Our goal should be to make our code more inclusive and more accepting of exceptions to the norm.
