How to build fair ML models?

Understanding the process to build fairness in your ML models

Vimarsh Karbhari
Acing AI
Jun 16, 2020


Researchers at Microsoft Research surveyed practitioners from 25 ML product teams in 10 major technology firms, and found that the “fair ML” research literature focuses too specifically on methods to assess biases, and would benefit from focusing more broadly on the full machine learning pipeline.

As the world grapples with today's crises, building fair models should be the first conversation data professionals are having. It is not a feature; it is table stakes.


Fairness should start with the source of the data itself. The data pipelines that deposit and transform the data can be leveraged to introduce or improve fairness in the data, which in turn translates into fairer ML models.

One approach taken by data science teams and the machine learning community is to explicitly define fairness: proposing concrete fairness metrics, along with approaches for encoding them into machine learning pipelines.
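To make that concrete, here is a minimal sketch (not from the article) of one widely used metric, the demographic parity difference: the gap in positive-outcome rates between groups. The `group` and `approved` columns are hypothetical.

```python
import pandas as pd

def demographic_parity_difference(df, group_col, outcome_col):
    """Gap between the highest and lowest positive-outcome rate
    across groups (0.0 means every group sees the same rate)."""
    rates = df.groupby(group_col)[outcome_col].mean()
    return rates.max() - rates.min()

# Hypothetical loan-approval decisions; `group` and `approved` are made-up columns.
decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "B", "B", "B"],
    "approved": [1, 1, 0, 1, 0, 0],
})
print(demographic_parity_difference(decisions, "group", "approved"))  # ≈ 0.33
```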

Steps for introducing fairness into ML models:

  • This first step is the most common place where unfairness gets introduced into models. We need to understand what the general public perceives to be fair decision making. Given an explicit definition of fairness, is it understood, and is it acceptable to a wide audience? If particular subgroups of the population do not comprehend the parts of automated systems that impact them, they can more easily be disadvantaged. It is important to view the data in this context as well.
  • The first step focused on the general public; the next is to understand and interpret what experts say. We need to understand what specialists perceive to be fair decision making and what tools they would need to do their jobs. This could mean developing tools to help audit systems, to better curate high-quality and well-sampled input datasets, or to permit faster exploratory data analysis (EDA) that helps find holes in the input and output of prototype or deployed systems.
  • After definitions and tools, we need to develop measurement techniques. Given one or more definitions of fairness or bias, these techniques measure at enterprise scale whether an ML-based system adheres to them. They also quantify by how much the system deviates and, when appropriate, alert humans and experts if it drifts beyond an acceptable level (see the first sketch after this list).
  • Once all of the above steps are in place, let us say we iteratively build a system. Whenever the system produces an output, it should also be able to demonstrate its level of fairness. Effective user experience will be required to allow stakeholders to comprehend the state of the art in various fielded automated systems. Communicating the state of a system, in accordance with particular definitions of fairness and bias, in a human-understandable way is paramount.
  • After the system is live, the team will surely encounter bugs. Quoting directly from the Microsoft Research study discussed earlier, “another rich area for future research is the development of processes and tools for fairness-focused debugging.” Debugging tools with a fairness focus would help teams identify under-sampled portions of an input dataset, or previously overlooked subgroups being adversely impacted by new decision-making rules (see the second sketch after this list).
  • Going with the overarching theme, we need to develop shared languages and channels between all involved parties, particularly engineers, domain experts, the general public, and policymakers. Engineers implement systems, policymakers set society-wide rules, and the public is impacted by the interaction between the two. All three need to understand the wants, incentives, and limitations of the others through open and continuous communication.
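Here is a minimal sketch of the measurement-and-alerting step above. It reuses the hypothetical `demographic_parity_difference` helper from the earlier snippet, and the 0.1 threshold is an assumed policy value, not something prescribed by the research.

```python
FAIRNESS_THRESHOLD = 0.1  # assumed acceptable gap; a real team would set this per policy

def check_fairness(decisions_batch, group_col="group", outcome_col="approved"):
    """Measure the fairness metric on a batch of decisions and flag it
    for human review if the gap exceeds the acceptable threshold."""
    gap = demographic_parity_difference(decisions_batch, group_col, outcome_col)
    if gap > FAIRNESS_THRESHOLD:
        # In production this might page a reviewer or open a ticket instead of printing.
        print(f"ALERT: demographic parity gap {gap:.2f} exceeds {FAIRNESS_THRESHOLD}")
    return gap
```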
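And a similar sketch of fairness-focused debugging: flagging subgroups that make up too small a share of the input dataset. The 5% floor is purely illustrative.

```python
def under_sampled_groups(df, group_col, min_share=0.05):
    """Return subgroups whose share of the dataset falls below `min_share`,
    i.e. likely under-sampled slices worth auditing or re-collecting."""
    shares = df[group_col].value_counts(normalize=True)
    return shares[shares < min_share].index.tolist()

# e.g. under_sampled_groups(training_data, "group") might return ["C"]
# if group C accounts for under 5% of rows.
```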

Recommendations

This process will require in-depth discussions with stakeholders: the general public, policymakers, lawyers, and domain experts. Yet these discussions will need to be complemented with accurate and scalable techniques that measure and communicate a real-world system's adherence to various definitions of bias and fairness in machine learning, and that allow human feedback to further improve an automated decision system's performance in practice. Improving fairness will be an ever-evolving process, and the team will need to keep sharpening the saw to improve fairness in their data and models.

References:

Fairness and Machine Learning, by Solon Barocas, Moritz Hardt, and Arvind Narayanan

Subscribe to our Acing AI newsletter. I promise not to spam, and it's FREE!

Thanks for reading! 😊 If you enjoyed it, see how many times you can hit 👏 in 5 seconds. It's great cardio for your fingers AND will help other people see the story.
