To ensure equitable outcomes, organisations need to assess and understand how their AI-driven systems make decisions.
The growth of Artificial Intelligence (AI) applications across industries means that more and more critical decisions are being handled by algorithms, in fields stretching from banking, legal, and media to fashion. Algorithms enable organisations to cope with the colossal volumes of data needed to drive digital products and services at scale. And with AI forecast to add $13 trillion to the global economy by 2030, we can expect algorithmic decision-making to become increasingly pervasive. The question is: what effect will this have on our society and our everyday lived experience? There is room to be hopeful; a large corpus of research indicates that algorithms are often more accurate and less prone to bias or prejudice than human decision-making. This promise of algorithmic objectivity makes sense if we consider the definition of machine learning:
“Algorithms parse data, learn from that data, and then apply what they’ve learned to make informed decisions.”
On the face of it, the main challenge for the data scientists and engineers who create these algorithms appears to be training them with the right data to ensure accurate outputs; fairness, it seems, comes automatically with the elimination of humans from the decision-making process. And yet an increasing number of stories are emerging about unfair outcomes wrought by algorithms. Algorithms can learn to be biased because they are “programmed by human beings, whose values are embedded into their software. And they must often use data laced with all-too-human prejudice”. Algorithmic bias is often simply a reflection of our society’s messy past, littered as it is with historic injustices. Discrimination that has since been outlawed can lie dormant, hidden in the data, only to be resurrected and even amplified by algorithms.
Some organisations believe that algorithmic bias is an engineering problem that can be solved through statistical methods. In our experience, however, ensuring equitable outcomes is more than a data science challenge: it is a complex issue that requires collaboration and input across the broader organisation. At The Dock, we have been working on this problem of algorithmic bias. By translating nascent academic research into a practical tool, we are making algorithmic fairness achievable for our clients. The Fairness Tool began as a hackathon challenge put to the Alan Turing Institute by Accenture’s Responsible AI team. The aim was to evaluate different forms of fairness from a data science perspective and to come up with an initial concept for a practical tool.
Building on that initial concept produced by the Alan Turing Institute, we began our research by making the following assumptions:
- Algorithmic fairness is, or will soon become, a concern for our clients’ data workflows due to regulation and increasing public awareness;
- Data scientists need a tool that enables them to assess models for bias and recommends repairs, where required, to ensure fairness;
- Data scientists require better tools to communicate their findings on algorithmic bias to business users and key decision-makers, so that the right course of action can be agreed.
Over the course of twelve weeks, our multidisciplinary team of designers, software engineers and data analysts developed a tool that allows organisations to understand the impact of their algorithms. It provides a way to identify and measure bias, as well as functionality to rectify or mitigate the impact of that bias. The trade-off between fairness and accuracy is an important consideration. It is quantified and visualised in the tool to support informed decision-making. In this way, we empower data scientists to clearly communicate the technical challenges involved when attempting to remove bias and achieve more equitable outcomes.
In his essay on ‘Explorable Explanations’, Bret Victor described ‘reactive documents’ that would allow the reader ‘to play with the author’s assumptions and analyses, and see the consequences’. This idea of giving the user the ability to explore resonated strongly with us. We wanted to build a tool that bridged the communication gap between data scientists working on algorithmic bias challenges from a technical perspective and a broader audience of decision-makers and business owners.
We started by examining the typical data modelling process used in algorithm development. Based on our research we mapped out an initial, general data model building workflow. A series of workshops and interviews with data scientists at a financial institute and experts from Accenture, enabled us to refine our map. We looked specifically at the credit risk modelling use case.
We used an activity called ‘Insights-Actions-Outcomes’, in which we asked the data scientists on the team to complete as many cards as possible, each capturing an insight, the action it suggests, and the expected outcome.
The aim of this exercise was to identify as many useful features and functions of the tool as possible. We then asked participants to place the cards on the workflow according to the timing. This exercise enabled us to figure out where we could most easily integrate the tool into the current data science workflow with minimal disruption or additional effort.
In data science there is a broad range of fairness metrics to choose from. Our tool implements three of them: mutual information analysis, disparate impact, and predictive parity. They were chosen specifically because they allow this first iteration of the tool to be as flexible and general as possible.
Mutual information analysis focusses on the data used for training the model. It reveals relationships between non-protected variables and variables that are protected, often by law, from use in data modelling. This allows model builders to avoid attributes that may be proxies for a protected variable. For example, race is a protected variable and cannot be used in a model to determine whether or not someone should get a loan. A ZIP code, however, is not protected, yet it can act as a proxy that indicates a person’s race.
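To make the proxy-detection idea concrete, here is a minimal sketch of how mutual information can flag a proxy variable. The data, variable names, and the notion of "high" mutual information used here are illustrative assumptions, not part of the Fairness Tool itself:

```python
# Sketch: flagging a proxy for a protected variable via mutual information.
# All data below is synthetic and chosen purely for illustration.
from sklearn.metrics import mutual_info_score

# Protected attribute and two candidate features:
race = ["A", "A", "B", "B", "A", "A", "B", "B"]
zip_code = ["10001", "10001", "20002", "20002",
            "10001", "10001", "20002", "20002"]  # perfectly aligned with race
income_band = ["low", "high", "low", "high",
               "low", "high", "low", "high"]     # independent of race

# High mutual information: ZIP code reveals race, so it is a likely proxy.
print(mutual_info_score(race, zip_code))     # ~0.69 nats (maximal here)
# Near-zero mutual information: income band carries no information about race.
print(mutual_info_score(race, income_band))  # 0.0
```

In practice the model builder would compute this score between each protected variable and every candidate feature, and investigate any feature whose score stands out.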
The second metric, disparate impact, also focusses on training data, but quantifies the impact of variables on outcomes for one subgroup of people relative to another. It helps to ensure that all groups are treated fairly. Our tool also enables the user to ‘repair’, or modify, the data in order to mitigate disparate impact. The aim is to reduce the ability to predict a protected variable from any of the other variables in the dataset: for example, it should not be possible to predict someone’s race from their ZIP code. This function is based on an academic paper that was identified during the Turing hackathon.
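As an illustration of the metric itself (not the tool’s own implementation), the disparate impact ratio can be computed as the rate of favourable outcomes for the unprivileged group divided by the rate for the privileged group; ratios below 0.8 are commonly flagged under the “four-fifths rule”. The function name and data below are synthetic assumptions:

```python
# Sketch of the disparate impact ratio on synthetic loan-approval data.
import numpy as np

def disparate_impact(outcome, group, privileged):
    """Ratio of favourable-outcome rates: unprivileged / privileged.
    A value of 1.0 means both groups are approved at the same rate;
    values below 0.8 are commonly treated as evidence of disparate impact."""
    outcome, group = np.asarray(outcome), np.asarray(group)
    rate_privileged = outcome[group == privileged].mean()
    rate_unprivileged = outcome[group != privileged].mean()
    return rate_unprivileged / rate_privileged

# 1 = loan approved, 0 = rejected (synthetic data)
approved = [1, 1, 1, 0, 1, 0, 0, 0]
gender   = ["M", "M", "M", "M", "F", "F", "F", "F"]

# Males approved at 0.75, females at 0.25 -> ratio 0.33, well below 0.8.
print(disparate_impact(approved, gender, privileged="M"))
```

A data ‘repair’ of the kind described above would transform the non-protected variables until ratios like this move back towards 1.0.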
Our final metric, predictive parity, shifts the focus from revealing bias in the data to adjusting the outputs of the model to ensure equitable outcomes. Essentially, it aims to ensure that error rates, the rates of incorrect predictions, are the same for all subgroups in the dataset. For example, a credit risk model should not be more lenient towards males than towards females.
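The error-rate comparison described here can be sketched as a simple per-group calculation. The function, data, and group labels below are illustrative assumptions rather than the tool’s API:

```python
# Sketch: comparing the rate of incorrect predictions across subgroups,
# using synthetic predictions from a hypothetical credit risk model.
import numpy as np

def error_rate_by_group(y_true, y_pred, group):
    """Fraction of incorrect predictions for each subgroup."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    return {g: float((y_true[group == g] != y_pred[group == g]).mean())
            for g in np.unique(group)}

y_true = [1, 0, 1, 0, 1, 0, 1, 0]   # actual outcomes (synthetic)
y_pred = [1, 0, 1, 1, 0, 1, 0, 1]   # model predictions (synthetic)
gender = ["M", "M", "M", "M", "F", "F", "F", "F"]

# The model errs on 1 of 4 male cases but all 4 female cases:
print(error_rate_by_group(y_true, y_pred, gender))  # {'F': 1.0, 'M': 0.25}
```

A gap like this between subgroups is exactly what the metric flags; adjusting the model’s outputs to close it is the intervention described above.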
Designing the experience
The Fairness Tool starts with the data scientist and is integrated with JupyterHub, a typical working environment for data scientists. We wanted to add fairness seamlessly as a step in the existing data science workflow, because a tool that demands significant additional effort or time during model building would be impractical. In the current workflow, data scientists compose analyses using our templates and push them to an online tool for business users. The business user can explore the analyses and choose the most appropriate ones to embed into a final report for dissemination amongst business owners and other decision-makers. The final collection of analyses is brought together in one coherent report and published to the repository for consumption.
We have engaged with an ecosystem spanning the IEEE, the Alan Turing Institute, and industry partners to build our knowledge in this space. Our resulting prototype is not just a prescriptive tool, because ensuring algorithmic fairness is not that simple. It fosters a deeper understanding of data science challenges for a broader, more diverse audience, helping key decision-makers in organisations take more meaningful action. The main goal of the tool is to balance technical solutions for algorithmic fairness with a more holistic perspective. Oftentimes a technical fix may overlook the social dimension, fall short of ethical standards, or prove unviable from a business perspective. Our tool facilitates that communication, democratising AI and rendering complex issues more transparent and accessible.
Many thanks to my fellow team members on the Algorithmic Fairness project at The Dock for sharing their knowledge and expertise, and also to my colleagues Grace, Fiona and Connor for valuable feedback on this article.