The Downsides of Algorithmic Choices

Calypso Leonard
4 min read · Apr 27, 2022

Recently, I’ve been learning Ruby and Rails, with the hope that I can supplement my knowledge of front-end development with a more robust back-end framework. I find the decisions around shaping my data structures to be an engaging challenge, but the more I think about building a back-end database, the more I think about working with larger and larger sets of data within that framework. In short, I’ve been thinking about how datasets and databases can build in errors and biases that create unfair outcomes beyond the digital world. It’s led me to do some interesting reading on the subject of algorithmic bias.

Algorithmic bias, briefly, is a catch-all term for systemic or repeated patterns in a computer system that lead to unequal outcomes, like privileging one group of users over another. It’s most prominent, and generally most concerning, when computer models are designed to take in massive data sets and make actionable assessments about individuals. Organizations from major social media companies to small nonprofits know the benefits of data-driven marketing and policy decisions: information about user behavior is a powerful tool in the drive to make efficient and effective strategy choices. However, in an increasingly data-driven world where “AIs” — or in reality machine learning algorithms — are gaining more and more credence in sectors from health care to finance to criminal justice, systemic biases in the data and analysis can have massively damaging impacts on users and consumers.

One study found that an algorithm designed to help guide health decisions for the U.S. healthcare system systematically deprioritized Black patients, directly leading to less care for sicker Black patients and untold harm through lack of medical intervention. Another case involved an algorithm designed to assess a defendant’s risk of recidivism, which was implemented in Wisconsin’s criminal justice system. Despite its usage throughout the state sentencing system, very few people understood what data was used to assess risk, or how the commercial algorithm actually calculated a score. An investigation by ProPublica found that the software routinely overestimated the risk that Black defendants would commit another crime in the future and consistently mislabeled White defendants as low risk. There are also less dire but still problematic consequences of algorithmic bias — if you train an image model only on pictures of Christian churches, it could have a hard time identifying a synagogue as a religious space.
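Since I’ve been learning Ruby, here’s a minimal Ruby sketch, with made-up records and field names, of the kind of error-rate comparison that can surface a disparity like the one ProPublica found: among people who did not reoffend, how often was each group still labeled high risk?

```ruby
# Made-up case records; in practice these would be thousands of real rows.
records = [
  { group: "A", predicted_high_risk: true,  reoffended: false },
  { group: "A", predicted_high_risk: false, reoffended: false },
  { group: "B", predicted_high_risk: false, reoffended: false },
  { group: "B", predicted_high_risk: true,  reoffended: true }
]

# False positive rate per group: of the people who did NOT reoffend,
# what share were still labeled high risk?
false_positive_rates = records
  .reject { |r| r[:reoffended] }
  .group_by { |r| r[:group] }
  .transform_values { |rows| rows.count { |r| r[:predicted_high_risk] }.to_f / rows.size }

false_positive_rates.each do |group, rate|
  puts "Group #{group}: #{(rate * 100).round(1)}% wrongly labeled high risk"
end
```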

The idea behind using algorithms to make outcome assessments is a noble one — taking the human touch out of care and policy choices in order to eliminate the influence of personal bias towards a particular identity. But we need to acknowledge that computer systems are only as smart as we make them, and only as noble-minded as we train them to be. If you build a sorting algorithm that just happens to favor one group over another, it is your responsibility to identify and correct that bias early on, before you use it to make real-world choices. Similarly, if you decide to use an algorithm to help make complex decisions with human outcomes, it is your responsibility to keep a close eye out for inequitable patterns your algorithm creates.

Well-designed and well-trained sorting algorithms do hold a lot of promise; it’s likely a lot faster and easier to fix biased algorithms than to fix biased people. But we need better guidelines and, frankly, a lot less faith in computers to make sure we’re building a truly fair system.

A group of researchers at the Brookings Institution proposed three key steps to ensure algorithms are working in an equitable way. First, they call for regulators to define bias in practical, real-world terms. Creating a clear definition makes it easier to build benchmarks for fair outcomes that can be used during development and after a system has been deployed. Second, they call for policymakers and regulators to use these newly defined goalposts to provide guidance and targets for investigations into biased algorithms. Finally, they call for specific accountability structures and documentation protocols to prevent biased algorithms from being deployed in the first place.

One relatively simple way to reduce bias is to ensure that all training data fed to a learning algorithm is carefully vetted not just for real world accuracy, but also for inherent bias. If all of your past data reflects a long history of biased sentencing, for instance, it should be no surprise that your algorithm starts to replicate that pattern in sentencing recommendations. Developers selecting training data need to be very aware of how their own experiences are coloring the input. But even the best training data will not be enough without oversight during implementation.
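To make “vetting” a little more concrete, here’s a small Ruby sketch (again with hypothetical rows and field names) of one simple pre-training check: how each group is represented in the data, and how often each group carries the positive label. Large gaps here are a signal that a model trained on this data will likely reproduce them.

```ruby
# Made-up training rows; the real set would come from the database.
training_rows = [
  { group: "A", label: 1 },
  { group: "A", label: 0 },
  { group: "B", label: 1 },
  { group: "B", label: 1 }
]

total = training_rows.size.to_f

# For each group: how much of the training data it makes up, and how often
# it carries the positive label.
training_rows.group_by { |row| row[:group] }.each do |group, rows|
  share         = rows.size / total
  positive_rate = rows.count { |row| row[:label] == 1 }.to_f / rows.size
  puts format("Group %s: %.0f%% of the data, %.0f%% positive labels",
              group, share * 100, positive_rate * 100)
end
```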

For rapidly developing technologies, it can be incredibly difficult to build clear guidelines or even consistent vocabularies — particularly when those in charge of market regulation do not have a clear grasp of what those technologies actually do. Advances in computing power make it exponentially easier to process massive amounts of data and come up with complex recommendations. That’s why we need to develop systems and habits around machine learning and other complex algorithms that make the benchmarks for unbiased outcomes clear from the start, before the technologies get too complex for us to reasonably comprehend.
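As a toy example of what a benchmark “from the start” could look like in code, here’s a Ruby sketch with a metric and threshold I made up for illustration: decide on an acceptable gap between groups before deployment, then re-run the check on every new batch of outcomes.

```ruby
# An illustrative benchmark chosen up front: flag the system if approval
# rates between groups ever drift more than 5 percentage points apart.
MAX_ALLOWED_GAP = 0.05

# Made-up decisions logged from a running system.
outcomes = [
  { group: "A", approved: true },
  { group: "A", approved: false },
  { group: "B", approved: true },
  { group: "B", approved: true }
]

rates = outcomes
  .group_by { |o| o[:group] }
  .transform_values { |rows| rows.count { |o| o[:approved] }.to_f / rows.size }

gap = rates.values.max - rates.values.min

if gap > MAX_ALLOWED_GAP
  warn "Benchmark failed: approval rates differ by #{(gap * 100).round(1)} points: #{rates}"
  # In a real pipeline, this is where deployment stops and a human investigates.
else
  puts "Within benchmark: #{rates}"
end
```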

I don’t pretend to have all the answers for how we can identify every potential source of bias, select the best training data, or create guidelines and measures to catch and prevent biased outcomes. But I’m glad these are major issues in the field that I’ve encountered at this early point in my learning, and I hope that more people like me, who are excited to be joining the world of developers, technology, and data-driven decisions, will start to keep the dangers of these biases in mind.

Calypso Leonard

Software engineer currently studying at Flatiron School in New York