What Is ‘Equity As Code,’ And How Can It Eliminate AI Bias?
This article was originally published in Forbes.
Engineers unleashed artificial intelligence (AI) bias, and it will be engineers who design the solutions that eliminate it. Authors of an article published by McKinsey Global Institute assert that “more human vigilance is needed to critically analyze the unfair biases that can become baked in and scaled by AI systems.” That’s an important start. The industry can also adopt a proactive, process-oriented approach to addressing AI bias. We have the tools to create data analytics workflows that address AI bias. When our work processes for creating and monitoring analytics contain built-in controls against bias, data analytics organizations will no longer be dependent on individual social awareness or heroism.
What Is AI Bias?
Machine learning (ML) models are computer programs that draw inferences from data — usually lots of data. ML is part of artificial intelligence (AI), which is a broader term used to describe computers and software that perform tasks in a way that we humans consider “intelligent.” ML and AI are being applied to tasks in nearly every industry to help companies more effectively and efficiently execute tasks and achieve goals. You’ve probably encountered AI innumerable times in the course of being an average consumer.
ML models are being used to aid disease diagnosis (for example, detecting cancer cells), tag photographs on social media, understand speech, identify credit card fraud, increase customer engagement with movie and TV recommendations on video streaming services and much more. The global AI market is projected to grow at a compound annual growth rate (CAGR) of 33% through 2027, drawing upon strength in cloud-computing applications and the rise in connected smart devices.
One way to think of ML models is that they instantiate an algorithm (a decision-making procedure often involving math) in software and then, at relatively low cost, deploy it on a large scale. The problem is that algorithms can absorb and perpetuate racial, gender, ethnic and other social inequalities. There are, unfortunately, many examples — here’s a well-known one that was caught early:
Amazon developers disclosed that an AI model, designed to screen job candidates, favored men over women. The algorithm had been trained using a database of its engineering hires over a 10-year period. Since the training data contained a majority of male developers, the AI model taught itself that men were preferable and downgraded references such as “women’s team captain” or mentions of an all-female educational institution in a resume. If Amazon had not recognized the problem, the AI algorithm might have been deployed on a large scale, further perpetuating existing gender biases.
Addressing AI Bias With DataOps
Many in the data industry recognize the serious impact of AI bias and are taking active steps to mitigate it. As the industry’s understanding of AI bias matures, model developers are getting better at defining and measuring bias. Data teams should formulate equity metrics in partnership with stakeholders. Once targets are defined, data professionals can iterate on eliminating bias from machine learning models. Armed with a comprehensive set of metrics and target goals, data scientists can address AI bias like other performance requirements.
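To make “equity metric” concrete, here is a minimal sketch of one widely used measure, the demographic parity ratio (the ratio of positive-outcome rates between groups). The data and the 0.8 threshold are illustrative; a team might instead adopt other metrics, such as equalized odds, depending on what it agrees with stakeholders.

```python
def demographic_parity_ratio(decisions, groups):
    """Ratio of positive-outcome rates between the least- and
    most-favored groups. A value near 1.0 indicates parity; the
    'four-fifths rule' often used in hiring flags ratios below 0.8."""
    rates = {}
    for g in set(groups):
        group_decisions = [d for d, grp in zip(decisions, groups) if grp == g]
        rates[g] = sum(group_decisions) / len(group_decisions)
    sorted_rates = sorted(rates.values())
    return sorted_rates[0] / sorted_rates[-1]  # min rate over max rate

# Hypothetical screening decisions (1 = advance the candidate)
decisions = [1, 0, 1, 1, 0, 1, 0, 0]
groups    = ["M", "M", "M", "M", "F", "F", "F", "F"]
print(demographic_parity_ratio(decisions, groups))  # ~0.33, well below 0.8
```

With a metric and a target like this in hand, bias becomes a measurable requirement that a model either meets or fails, just like accuracy or latency.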
The data industry can begin the process of mitigating bias by viewing AI systems from a manufacturing process perspective. Machine learning systems receive data (raw materials), process data (work in progress), make decisions or predictions, and output analytics (finished goods). We call this process flow the “data factory,” and like other manufacturing processes, it should be subject to quality controls. The data industry needs to treat AI bias as a quality problem.
If you walk into any modern manufacturing facility, you will see automation and quality controls at every step. When you buy a car, you can be sure that the factory has tested every component and subsystem. Additionally, the vehicle contains built-in computers that diagnose issues and control dashboard warning alerts. The car is tested before it is sold and then monitored while in operation.
AI systems should be subject to this same level of process control. The data industry employs a new term, “DataOps,” to describe the application of manufacturing quality methods such as lean manufacturing and Six Sigma to data and analytics. Let’s discuss how DataOps can address AI bias.
Equity As Code
In a traditional software development lifecycle process, new code undergoes DevOps automated testing before deployment. Tests defined in the continuous integration and deployment pipeline check if the code is ready for production.
AI models can be tested for AI bias as part of their pre-deployment testing. In the example above, Amazon developed a test showing that its model favored male resumes. The Amazon example also specifically illustrates the value of testing training data for bias before model development. An AI model and training data should undergo a battery of equity tests and measurements at every lifecycle stage. Anti-bias controls and metrics can be instantiated in tests applied to AI model performance to determine whether the AI model is adhering to equity requirements. A quality test suite may enforce “equity,” like any other performance metric.
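As an illustration of testing training data before model development (the step the Amazon example highlights), here is a minimal sketch that measures how a protected attribute is represented in a training set. The field names and the skewed sample data are hypothetical.

```python
def training_data_balance(records, attribute):
    """Share of training examples per value of a protected attribute.
    Severe imbalance -- like a hiring dataset that is mostly male --
    is an early warning that a model may learn the imbalance as a
    preference, so a pre-development equity test can flag it."""
    counts = {}
    for record in records:
        value = record[attribute]
        counts[value] = counts.get(value, 0) + 1
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

# Hypothetical resume training set, skewed like Amazon's hiring history
resumes = [{"gender": "M"}] * 8 + [{"gender": "F"}] * 2
balance = training_data_balance(resumes, "gender")
# balance["M"] is 0.8 and balance["F"] is 0.2; an automated test in the
# deployment pipeline could fail the build when any group's share falls
# below a floor the team has agreed on with stakeholders.
```

The same pattern extends naturally to model outputs: run the candidate model on a held-out evaluation set, compute the agreed equity metrics, and block deployment if any metric misses its target.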
Machine learning systems differ from traditional software applications in that ML systems depend on data and data changes continuously. As data flows, a deployed model may drift out of the target range of accuracy. A deployed model must be continuously monitored for bias and other quality issues while in operation. Each time a model is updated, it must undergo testing before being deployed. Continuous testing, monitoring and observability prevent biased models from deploying or continuing to operate. We call this new approach to mitigating AI bias “equity as code” because the tests that enforce equity are built into automated software applications that test, deploy and monitor the model 24/7.
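The continuous-monitoring idea can be sketched as a sliding window over recent production decisions, with an alert when a parity metric drifts past a threshold. The class name, window size and threshold below are illustrative, not a specific product API.

```python
from collections import deque

class EquityMonitor:
    """Sliding-window equity check for a deployed model: recompute a
    parity metric over the most recent decisions and flag drift."""

    def __init__(self, window=1000, threshold=0.8):
        self.decisions = deque(maxlen=window)  # (group, decision) pairs
        self.threshold = threshold

    def record(self, group, decision):
        """Log one production decision (decision is 0 or 1)."""
        self.decisions.append((group, decision))

    def parity_ratio(self):
        """Min/max ratio of positive-decision rates across groups."""
        totals, positives = {}, {}
        for group, decision in self.decisions:
            totals[group] = totals.get(group, 0) + 1
            positives[group] = positives.get(group, 0) + decision
        rates = sorted(positives[g] / totals[g] for g in totals)
        if not rates or rates[-1] == 0:
            return 1.0  # no decisions (or no positives) yet: nothing to compare
        return rates[0] / rates[-1]

    def drifted(self):
        """True when the monitored metric has left the target range."""
        return self.parity_ratio() < self.threshold
```

In practice, a monitor like this would feed the same alerting dashboards used for accuracy and data-quality checks, so a biased model is caught in operation rather than discovered after harm is done.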
DataOps “equity as code” provides the approach and methodological tools to impose equity controls on AI algorithms. A program of automated testing and continuous monitoring can help avoid deploying AI systems that instantiate and perpetuate inequities at scale.
Originally published at https://datakitchen.io on October 28, 2021.