Getting started with fairness assessment in your Machine Learning pipeline

Amit Poonia · Published in Trustworthy AI · Jun 19, 2020

So you have a machine learning pipeline and it works. But it is possible that whatever prediction or classification it makes is biased towards certain classes. There can be many different reasons for this, and in some cases the bias may even be intentional. Regardless, such bias can have ethical implications, in the future perhaps regulatory and legal ones too, and the resulting system can be termed unfair.

How do we detect the (human) biases that contribute to an unethical and/or unfair machine learning system? A typical ML pipeline has, more or less, the following stages, and each of them offers a chance for human bias to creep in and lead to an unfair system. The goal should be to keep track of human activity in every stage and assess how it eventually affects the final prediction.

Data Collection: Sampling from a skewed demography (e.g. most of the data contributors are male, or come from one particular demographic) introduces bias into your raw data from the get-go.

Data Processing: If the annotators all come from a certain demographic or age group, and/or you don’t have multiple annotators to review each other’s work or to do voting-based annotation, it is highly likely that some human bias is introduced, and maybe even incorrect annotations.

Feature Engineering: Using human-specific attributes to build the feature set for training the model can be the biggest source of bias. Imagine using the personal metadata of a training sample as features and making predictions based on that personal information. This is a hard issue for some industries, especially AdTech, recommendation systems etc., where personal info and history are the main data.
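To make this concrete, here is a minimal sketch (in Python, with made-up column names) of keeping personal attributes out of the feature set while still retaining them separately for later auditing. Nothing here comes from a specific project; it only illustrates the separation.

import pandas as pd

# Toy training data; the column names are purely illustrative.
df = pd.DataFrame([
    {"query_length": 28, "query_type": "navigation", "gender": "male", "age": 34, "label": 1},
    {"query_length": 19, "query_type": "search", "gender": "female", "age": 27, "label": 0},
])

# Personal attributes are excluded from the features used for training,
# but kept aside so they can still be used for fairness auditing.
PERSONAL_ATTRIBUTES = ["gender", "age"]
sensitive = df[PERSONAL_ATTRIBUTES]
features = df.drop(columns=PERSONAL_ATTRIBUTES + ["label"])
labels = df["label"]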

Model Training: The training algorithms themselves are not really biased, but depending on which algorithm you use it can be easier or harder to interpret the model’s behaviour; logistic regression, for example, is straightforward to inspect, while deep learning methods are more of a black box. It goes without saying that unless quality and performance are drastically different, the simpler model should be preferred.
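As a small illustration of that interpretability gap (a sketch on synthetic data, not a recipe), scikit-learn’s LogisticRegression exposes one coefficient per feature, so the direction and strength of each feature’s influence can be read off directly:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: 200 samples, two illustrative features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Each coefficient shows how strongly a feature pushes the prediction,
# something a deep network does not expose so directly.
for name, coef in zip(["feature_a", "feature_b"], model.coef_[0]):
    print(f"{name}: {coef:+.3f}")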

How to avoid human bias? There are some relatively easy steps that can be taken to deal with the above-mentioned issues.

- Sample data from a demography that is diverse with respect to gender, ethnicity, nationality etc.
- Apply the same principle while annotating or processing data, or in any other similar stage of the pipeline.
- During feature engineering, don’t use human attributes as features; this can be hard in some domains.
But how to actually achieve all this, and how to quantify human bias? Maintain metadata for each data sample and keep track of everything “human” that is involved in one way or another.
Example:
Some sample data with one field, "text", to which an annotation such as "intent" will later be added:
{"text": "navigate me to London bridge"}
With metadata about the source of the data:
{
  "sample": {"text": "navigate me to London bridge"},
  "metadata": {"source_gender": "male"}
}
Extended with metadata about the annotator:
{
  "sample": {"text": "navigate me to London bridge", "intent": "navigation"},
  "metadata": {"source_gender": "male", "annotator_gender": "female"}
}
In the beginning, something naive can be used to quantify how generalised the data is. Shannon entropy, for instance, can be used to assess how diverse the dataset is with respect to a particular attribute of the data or its metadata, e.g. the gender attributes in the example above. This assessment can be rerun whenever the training data changes. It also poses a conundrum: we need more data, personal data specifically, to verify that the data is diverse in its sources and transformations, so we need to keep privacy in mind and be careful about how this tracking data is collected and maintained.
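A minimal sketch of that entropy check could look like the following; the record layout mirrors the example above, while the extra records and the helper function are made up for illustration:

import math
from collections import Counter

def attribute_entropy(records, attribute):
    """Shannon entropy (in bits) of one metadata attribute across all records."""
    counts = Counter(r["metadata"][attribute] for r in records)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

records = [
    {"sample": {"text": "navigate me to London bridge"},
     "metadata": {"source_gender": "male", "annotator_gender": "female"}},
    {"sample": {"text": "what is the weather today"},
     "metadata": {"source_gender": "female", "annotator_gender": "female"}},
    {"sample": {"text": "play some jazz"},
     "metadata": {"source_gender": "male", "annotator_gender": "male"}},
]

# Compare against the maximum entropy for the observed categories:
# the closer the two numbers, the more balanced the dataset is.
entropy = attribute_entropy(records, "source_gender")
max_entropy = math.log2(2)  # two categories observed in this toy set
print(f"source_gender entropy: {entropy:.3f} / {max_entropy:.3f} bits")

The same check can be rerun for annotator_gender, or any other tracked attribute, every time the training data changes.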
Once the data pipeline is handled, it is time to focus on the modelling part. Even if a dataset is quite balanced and diverse, it is still possible that during feature engineering certain attributes are used as features and result in an unfair trained model. To assess how the feature set impacts the prediction, and whether a certain feature introduces bias and skews the model’s output, the following is a non-exhaustive list of useful tools (a short sketch with one of them follows the list).

1. Fairlearn (https://github.com/fairlearn/fairlearn)
2. LIME (https://github.com/marcotcr/lime)
3. SHAP (https://github.com/slundberg/shap)
4. What-If Tool (https://github.com/pair-code/what-if-tool)
5. InterpretML (https://github.com/interpretml/interpret)
6. AIF360 (https://github.com/IBM/AIF360)
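As a starting point, here is a hedged sketch of what such an assessment could look like with Fairlearn’s metrics; the model, data and sensitive attribute are toy stand-ins, and the calls used are Fairlearn’s documented MetricFrame and demographic_parity_difference:

import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.metrics import MetricFrame, selection_rate, demographic_parity_difference

# Toy stand-in data; the sensitive attribute is kept out of the feature set.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)
gender = rng.choice(["male", "female"], size=500)

model = LogisticRegression().fit(X, y)
y_pred = model.predict(X)

# Selection rate (fraction predicted positive), broken down by group.
frame = MetricFrame(metrics=selection_rate, y_true=y, y_pred=y_pred,
                    sensitive_features=gender)
print(frame.by_group)

# A single summary number: the gap in selection rates between groups.
gap = demographic_parity_difference(y, y_pred, sensitive_features=gender)
print(f"demographic parity difference: {gap:.3f}")

SHAP, LIME or the What-If Tool can then be used on the same model to see which individual features are driving any skew that shows up.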
In conclusion: keep track of who is creating, changing, and augmenting the data, measure how diverse the human actors in your pipeline are, and maintain some sort of score or metric for that. Then assess the trained models, using freely available tools, to see whether some feature is disproportionately or unfairly affecting predictions. This setup not only helps you make the system more fair, but also more robust; you will understand it better, and it will help you find even non-human biases.
