The growing body of work and research on Artificial Intelligence (AI) applications across industrial functions has resulted in automation and intelligent operations that augment human work, freeing people to focus on managerial functions that involve judgement and creativity. This means that critical decisions once handled by humans are now handled by algorithms, in operations that stretch from asset-intensive industries such as chemicals, mining, and oil and gas to creativity-intensive industries such as media and fashion.
Data-fueled applications that drive digital products and services at scale need more than just automation. To handle the colossal volume of data generated by humans and machines connected by the internet, we need algorithms that work responsibly and without bias, and that take into account the dynamic changes taking place in society. And with AI forecast to add $13 trillion to the global economy by 2030, there’s no doubt that algorithmic decision-making will become increasingly pervasive.
Let’s take a step back and revisit the familiar definition of Machine Learning: “Algorithms parse data, learn from that data, and then apply what they’ve learned to make informed decisions.”
So we understand that data scientists, engineers, and architects design and build algorithms, training them with the right data, to ensure broadly two things:
The first is accuracy, and achieving it is a learning process in itself. Finding the right data for accuracy is a challenge, as it can only be achieved after rounds of modelling and changes to the training data. The definition of the right data will therefore differ from one situation to another. For instance, an algorithm that identifies customers most likely to buy a product from a luxury fashion boutique will have a different design and structure from one that identifies customers most likely to order a particular meal from a vegetarian restaurant. Although both algorithms would have certain elements in common, such as recency, frequency, and monetary value, other variables would differ drastically between the two, depending on the nature of the business.
The second is fairness, which comes with the elimination of humans’ personal judgments from the decision-making process. This is the most important and the trickiest challenge, because there is a high chance that algorithms become biased in the learning process: they are programmed by humans, whose values, thoughts, and opinions are transferred into the AI software. This means a complex program needs transparency to shed light on its automated judgement. The fear about automated decision-making at financial, legal, and technology firms is that it often becomes self-serving, abusing secrecy for profit and personal interest.
Why is algorithmic fairness challenging?
Algorithmic bias is a reflection of our society’s messy past, littered with archaic bureaucracy. Any form of discrimination that has been eliminated can still lie dormant in the data, only to be resurrected, partly through unintentional programming into the software, and then amplified by algorithms.
On the face of it, algorithmic bias looks like an engineering problem that could be solved by econometric and statistical methods. However, ensuring an unbiased, equitable, and ethical outcome is more than a data science challenge. It requires human intervention in setting up the program from which the AI learns. This comes with great responsibility, and it calls for a mind that builds the highest level of fairness into the program.
How do we quantify fairness?
In data science, there’s a wide range of metrics and methods to choose from.
The trade-off between fairness and accuracy is an important consideration.
To account for fairness and accuracy it is important that the model meets the following two conditions:
- A generic model that progressively moves towards a specific model through iterations to make it relevant to the situation
- Flexibility to modify factors, variables, and data to ensure unbiased outcomes
Based on these considerations, I would suggest combining the following three methods to quantify fairness while keeping a good level of accuracy:
Mutual Information Analysis focuses on the raw data used for training the model. It is a good metric for understanding the relationship between protected variables (variables that are not meant to be used in data modelling) and non-protected variables (variables that can be used in data modelling). For example, an algorithm that decides whether or not someone should be shortlisted for an interview cannot use gender in the model, as it is a protected variable. Another example of a protected variable is race, which cannot be used in a model that determines whether or not someone should get an education loan. On the other hand, variables like test scores and punctuality in paying bills are not protected variables, but unfortunately they can become proxies that indicate a person’s race once the model is trained on them. This can potentially be resolved by adopting the next metric, Disparate Impact, in combination with this one.
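To make the proxy problem concrete, here is a minimal sketch of how mutual information between a protected variable and a non-protected one could be measured on discrete data. The function name `mutual_information` and the toy columns `race`, `score_band`, and `coin` are hypothetical and made up for illustration; real data would usually need binning for continuous variables and far more rows.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equally long
    sequences of discrete values (hypothetical helper)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each (x, y) pair
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)), written with raw counts
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


# Made-up toy data: a protected variable and two candidate features.
race = ["a", "a", "a", "a", "b", "b", "b", "b"]
score_band = ["high", "high", "high", "low", "low", "low", "low", "high"]
coin = ["h", "t", "h", "t", "h", "t", "h", "t"]

mi_score = mutual_information(race, score_band)  # above zero: a potential proxy
mi_coin = mutual_information(race, coin)         # zero: carries no information
```

A value near zero suggests the feature tells the model little about the protected variable, while a larger value flags a feature that may act as a proxy and deserves a closer look.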
Disparate Impact also focuses on the training data, and reveals how strongly variables influence outcomes for one cluster of people or items compared with another. This helps to ensure that all clusters are treated equally, by making sure that each variable has a similar influence on every cluster. During training, the ability to predict a protected variable from other variables in the dataset often becomes strong. For instance, the gender of a person can in many cases be predicted from test scores, which creates a bias in the algorithm. This predictive power can be selectively reduced by modifying the data in an iterative process, mitigating the impact of the variables that predict the protected variable.
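One common way to put a number on disparate impact, which is an assumption here rather than the only possible formulation, is the ratio of favourable-outcome rates between two groups. The sketch below uses a hypothetical helper `disparate_impact` and made-up shortlisting data.

```python
def disparate_impact(outcomes, groups, unprivileged, privileged):
    """Ratio of favourable-outcome rates: unprivileged / privileged.
    `outcomes` holds 1 for a favourable decision and 0 otherwise."""
    def rate(group):
        selected = [o for o, g in zip(outcomes, groups) if g == group]
        return sum(selected) / len(selected)

    return rate(unprivileged) / rate(privileged)


# Made-up toy data: 1 = shortlisted for interview, 0 = not shortlisted.
shortlisted = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
gender = ["m", "m", "m", "m", "m", "f", "f", "f", "f", "f"]

# Rate for "f" is 1/5, rate for "m" is 4/5, so the ratio is 0.25.
ratio = disparate_impact(shortlisted, gender, unprivileged="f", privileged="m")
```

A ratio of 1.0 means both groups receive favourable outcomes at the same rate. A widely cited rule of thumb, the "four-fifths rule", treats ratios below 0.8 as a warning sign, so the toy value of 0.25 would call for another iteration on the data.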
Predictive Parity focuses on adjusting the outcomes of the model to ensure equality and fairness. The objective is essentially to reduce the error rates, that is, the rates of incorrect predictions, and to ensure that they are the same for all clusters in the dataset.
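As a rough sketch of this check, the snippet below computes the error rate per cluster, as described above, alongside the per-cluster precision, which is the quantity many references compare under the name predictive parity. The helper `group_rates` and the labels, predictions, and groups are hypothetical and made up for illustration.

```python
def group_rates(y_true, y_pred, groups):
    """Per-group error rate and precision for binary labels (0 or 1)."""
    stats = {}
    for t, p, g in zip(y_true, y_pred, groups):
        s = stats.setdefault(g, {"n": 0, "errors": 0, "tp": 0, "pred_pos": 0})
        s["n"] += 1
        s["errors"] += int(t != p)             # any incorrect prediction
        s["pred_pos"] += int(p == 1)           # predicted positive
        s["tp"] += int(p == 1 and t == 1)      # correctly predicted positive
    return {
        g: {
            "error_rate": s["errors"] / s["n"],
            # Precision is undefined when a group has no positive predictions.
            "precision": s["tp"] / s["pred_pos"] if s["pred_pos"] else None,
        }
        for g, s in stats.items()
    }


# Made-up toy data for two clusters, "a" and "b".
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
error_gap = abs(rates["a"]["error_rate"] - rates["b"]["error_rate"])
```

In this toy case cluster "a" has an error rate of 0.25 and cluster "b" one of 0.5, so the gap of 0.25 is what further adjustment of the model's outcomes would aim to close.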
Sometimes an algorithm overlooks a social dimension, falls short of ethical standards, or ignores some business objectives. In such cases, complete transparency about these complex algorithms will democratize AI, making it possible for it to be more responsible, ethical, and practical.