AI In Criminal Justice: The Problem With Trusting Algorithmic Objectivity

The YX Foundation
The YX Foundation Journal
7 min read · Oct 10, 2020

by Charumathi Badrinath

Mentor: Prof. Isra Ali; Student Editor: Anu Zaman

[Image: a robotic hand wielding a gavel]

Twenty-year-old Dylan Fugett and 21-year-old Bernard Parker had a lot in common. Both had been arrested before, Fugett for attempted burglary and Parker for resisting arrest without violence, and both were soon arrested again for felony drug possession. The second time around, however, something was different: information about Fugett (White) and Parker (Black) was fed into an algorithm that assured the presiding judge that while Fugett’s risk of reoffending was low, Parker posed a high risk to the community. As time would tell, this “future-predicting” algorithm, which had favored Fugett, was wrong: Fugett was arrested three more times, while Parker had no subsequent offenses.

While the idea of algorithms predicting the risk that a defendant will re-offend may seem dystopian, this sort of technology is already in use around the country. According to a 2019 primer on the topic for judges, prosecutors, and defense attorneys, pre-trial risk assessments are “empirically based approaches” which, drawing on “more than 65 years of rigorous research studying factors that are statistically associated with public safety risks,” are allegedly able to predict a defendant’s likelihood of recidivism (committing another crime). The most widely used of these risk assessment tools is COMPAS, an algorithm developed by the for-profit company Northpointe which, according to a practitioner’s guide published by the company, is intended to “inform decisions regarding the placement, supervision and case management of offenders” and to “aid in correctional intervention to decrease the likelihood that offenders will re-offend.” To make its prediction, COMPAS asks defendants questions on a wide variety of often nebulous topics, ranging from their educational level to so-called “criminal attitudes”; the answers are then fed into the algorithm to generate a recidivism “risk score” ranging from 1 (lowest risk) to 10 (highest risk).
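Because the algorithm is proprietary, no one outside Northpointe can see how those answers are actually combined. As a purely hypothetical illustration of how a questionnaire-based tool could turn responses into a 1–10 decile score, consider the sketch below; every feature name, weight, and cut-point in it is invented for explanation and bears no relation to the real COMPAS model.

```python
# Purely illustrative sketch of a questionnaire-based risk score.
# The real COMPAS model is proprietary; every feature, weight, and
# cut-point below is invented for explanation only.

HYPOTHETICAL_WEIGHTS = {
    "prior_arrests": 0.9,
    "age_at_first_arrest": -0.4,          # younger first arrest -> higher score
    "friends_arrested": 0.6,
    "residential_moves_12mo": 0.3,
    "agrees_hungry_person_may_steal": 0.5,
}

def raw_score(answers: dict) -> float:
    """Weighted sum of (numeric) questionnaire answers."""
    return sum(HYPOTHETICAL_WEIGHTS[k] * answers.get(k, 0) for k in HYPOTHETICAL_WEIGHTS)

def decile_score(raw: float, population_raw_scores: list) -> int:
    """Map a raw score to a 1-10 decile relative to a reference population."""
    share_at_or_below = sum(s <= raw for s in population_raw_scores) / len(population_raw_scores)
    return min(10, int(share_at_or_below * 10) + 1)

# Example: one defendant's answers scored against a made-up reference population.
defendant = {"prior_arrests": 2, "age_at_first_arrest": 17, "friends_arrested": 3,
             "residential_moves_12mo": 4, "agrees_hungry_person_may_steal": 1}
reference = [raw_score({"prior_arrests": n, "age_at_first_arrest": 25}) for n in range(20)]
print(decile_score(raw_score(defendant), reference))
```

The point of the sketch is not the arithmetic but the opacity: without access to the weights, there is no way to audit whether any individual question, or any proxy for race, is driving the final number.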

While the COMPAS algorithm is a proprietary secret, the questions it asks defendants to calculate risk scores are publicly available. Here is a sample of 11 out of the 133 questions asked on the assessment, taken from every category:

  1. Based on the screener’s observations, is this person a suspected or admitted gang member?
  2. If you lived with both parents and they later separated, how old were you at the time?
  3. How many of your friends/acquaintances have ever been arrested?
  4. How often have you moved in the last twelve months?
  5. In your neighborhood, have some of your friends or family been crime victims?
  6. Were you ever suspended or expelled from school?
  7. How often do you have barely enough money to get by?
  8. How often did you feel bored?
  9. I have never felt sad about things in my life. (Agree or Disagree)
  10. If people make me angry or lose my temper, I can be dangerous. (Agree or Disagree)
  11. A hungry person has the right to steal. (Agree or Disagree)

Vague questions, like agreeing or disagreeing with the statement “a hungry person has the right to steal,” immediately raise concern, and the proprietary nature of the COMPAS algorithm makes it impossible for third parties to verify that answers to these questions are weighted fairly when risk scores are calculated. Additionally, in 2014 the US Department of Justice expressed concern about using education levels, employment history, family circumstances, and demographic information as factors in deciding the risk score, since they are oftentimes a consequence of systemic racism. When asked about potential bias in the algorithm, Tim Brennan, the creator of COMPAS, said that it is difficult to construct a score without factors such as “poverty,” “joblessness,” and “social marginalization,” asserting that if these factors are omitted from the risk assessment, accuracy goes down.

Despite this, risk assessment technologies were adopted across the country as a way to quickly and effectively manage cases in overcrowded systems and decrease pretrial incarceration rates. Additionally, the technology gave court systems a way of handling cases in which judges could not be held liable for bias in their decisions.

This last factor, decreasing bias, is the one most often cited as the main reason for adopting AI technologies in the criminal justice system. A 2014 University of Michigan Law School study found that, even when factors such as arrest location, offense committed, and criminal history remain the same, there is “an unexplained Black-White sentence disparity of approximately 9 percent,” which rises to 13 percent when drug-related offenses are also considered.

Rather than being symptomatic of overt racism, this phenomenon is the result of covert racism: implicit racial bias and systemic racism, which 65% of judges surveyed by the National Judicial College agree exist in the criminal justice system. The correlation between racial bias and sentencing was demonstrated in a 2009 Cornell Law study in which researchers first administered an Implicit Association Test (IAT) to 133 judges of different races, genders, and political affiliations and then sought to correlate IAT scores with judgments on cases. After the judges were “race primed” (through a subliminal word-priming technique in which half the judges had Black-associated words flashed in front of them while the other half saw unrelated words), they were presented with a case in which the defendant’s race was not made explicit. Analyzing the results, the researchers found a statistically significant correlation between judges’ IAT scores and their race-primed sentencing decisions: judges with higher IAT scores were, on average, more likely to hand down a harsher sentence to a defendant they had been subconsciously primed to think was Black.
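In statistical terms, the researchers were asking whether a judge’s IAT score predicted the severity of the sentence handed down after priming. A minimal sketch of that kind of test, using fabricated placeholder numbers rather than the study’s actual data, might look like this:

```python
# Illustrative only: testing for a correlation between judges' implicit-bias
# (IAT) scores and sentencing severity after race priming. The numbers below
# are fabricated placeholders, not data from the 2009 study.
from scipy.stats import pearsonr

iat_scores      = [0.12, 0.35, 0.48, 0.05, 0.61, 0.27, 0.40, 0.18]  # one score per judge
sentence_months = [18,   30,   36,   14,   40,   24,   33,   20]    # sentence after priming

r, p_value = pearsonr(iat_scores, sentence_months)
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")  # a small p-value would mark the correlation as statistically significant
```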

The real-world effects of this inherent bias were quantified in a 2018 study by the National Bureau of Economic Research, which found that Black defendants are 2.4 percentage points more likely than White defendants to be detained before their court hearing, with an average bail that is $7,281 higher.

AI looked like a better option.

While the adoption of risk assessment technology has had its intended effect of reducing overall incarceration rates in some states, the quick pace of integration and overconfidence in the objectivity of data meant that states often didn’t conduct comprehensive testing before implementing the algorithms at scale. In New York, for example, COMPAS was implemented in 2001 and rolled out to nearly all of the state’s probation departments by 2010, yet a comprehensive statistical evaluation of the tool wasn’t conducted until 2012, and even then, potential racial disparities were not examined. In 2014, Attorney General Eric Holder called on the U.S. Sentencing Commission to study the use of algorithms in court, concerned that the scores might be a source of bias; while the Commission did conduct a study of the factors driving recidivism risk, it again did not look at the correlation between risk scores and actual recidivism.

In a 2016 study on the accuracy of the COMPAS algorithm, researchers at ProPublica gathered the risk scores assigned to over 7,000 people arrested in Broward County, Florida in 2013 and 2014. Using the COMPAS definition of recidivism, they evaluated the accuracy of the risk scores and the disparities between scores and outcomes for Black and White defendants. The study found that the algorithm’s accuracy was rather low: only 61% of defendants with a medium to high risk score for two-year recidivism went on to commit a crime within that time frame. For violent crime, the figure was only 20%.
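ProPublica published the underlying data and analysis code (github.com/propublica/compas-analysis), so the headline accuracy check can be roughly re-created. The sketch below assumes that repository’s two-year recidivism file and column names such as decile_score and two_year_recid; treat those names as assumptions rather than a verified schema.

```python
# Rough re-creation of the accuracy check on ProPublica's public COMPAS data.
# Assumes compas-scores-two-years.csv with columns decile_score (1-10) and
# two_year_recid (0/1); the column names are assumptions, not a verified schema.
import pandas as pd

df = pd.read_csv("compas-scores-two-years.csv")

# "Medium to high risk" is commonly read as a decile score of 5 or above.
flagged = df[df["decile_score"] >= 5]
hit_rate = flagged["two_year_recid"].mean()
print(f"Share of medium/high-risk defendants who recidivated within two years: {hit_rate:.0%}")
```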

The comparison between Black and White defendants revealed a far more shocking result. The algorithm falsely flagged Black defendants as future criminals at twice the rate of White defendants. Conversely, a larger percentage of White defendants assigned a low risk score went on to commit crimes than Black defendants assigned a low risk score. Even after controlling for factors such as prior crimes, age, and gender, Black defendants were still 45% more likely to be assigned higher risk scores. It became frighteningly clear that the algorithm judges trusted and depended upon for its “objectivity” was fraught with racial bias and worked to reinforce systemic racism. Not only could judges use risk scores to rationalize their own beliefs about race as being supported by “facts,” but the mislabeling of Black individuals as “high-risk” meant they faced prolonged removal from their communities, causing their families to suffer and continuing the cycle of poverty and parental absence that so strongly influences the COMPAS score.
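The disparity itself comes from comparing error rates across racial groups: the false positive rate (labelled higher risk but no re-offense within two years) and the false negative rate (labelled lower risk but re-offended). A sketch of that comparison, under the same assumptions about the dataset and its column names as above:

```python
# Error rates by race on the assumed ProPublica dataset (column names and
# category labels such as "African-American"/"Caucasian" are assumptions).
import pandas as pd

df = pd.read_csv("compas-scores-two-years.csv")
df["high_risk"] = df["decile_score"] >= 5

for race in ["African-American", "Caucasian"]:
    sub = df[df["race"] == race]
    no_recid = sub[sub["two_year_recid"] == 0]
    recid    = sub[sub["two_year_recid"] == 1]
    fpr = no_recid["high_risk"].mean()   # flagged high risk but did not reoffend
    fnr = (~recid["high_risk"]).mean()   # flagged low risk but did reoffend
    print(f"{race}: false positive rate {fpr:.0%}, false negative rate {fnr:.0%}")
```

If the algorithm made its mistakes evenly, these two rates would be similar across groups; ProPublica’s finding was that they are not.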

Given the magnitude of racial bias built into risk assessment algorithms, it is disheartening to see how liberally risk scores are used and abused in our court system. Although Wisconsin law states that risk scores should only be used to determine which defendants are eligible for probation and treatment programs, in alignment with Northpointe’s recommendations, judges in the state have cited the scores in their sentencing decisions. In August 2013, Judge Scott Horne declared that the defendant in the case he was presiding over had been “identified through the COMPAS assessment as an individual who is at high risk to the community” and sentenced him to eight and a half years in prison. Even Brennan admits that he doesn’t like the idea of “COMPAS being the sole evidence that a decision would be based upon.”

Despite the alarming overuse of AI technologies in the criminal justice system and the clear evidence of built-in bias, the National Institute of Justice (NIJ) continues to sponsor research in this field. According to a recent NIJ report, researchers at the Research Triangle Institute in Durham, North Carolina are creating an “automated warrant service triage tool” that estimates the time until the next occurrence of an event of interest in order to predict the risk of reoffending among absconding offenders.
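The phrase “time until the next occurrence of an event of interest” describes what statisticians call a time-to-event, or survival, model. The NIJ-funded tool itself is not public, so the sketch below only illustrates the general idea, using the open-source lifelines library and its bundled Rossi recidivism dataset; neither is part of the actual project.

```python
# Generic time-to-event sketch, not the RTI/NIJ triage tool: a Cox
# proportional hazards model on the classic Rossi recidivism dataset
# that ships with the lifelines library.
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

rossi = load_rossi()  # weeks until re-arrest (or end of follow-up) plus covariates

cph = CoxPHFitter()
cph.fit(rossi, duration_col="week", event_col="arrest")  # 'arrest' marks whether re-arrest was observed
cph.print_summary()  # hazard ratios show how each covariate shifts re-arrest risk over time
```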

While AI technology has made promising inroads in other fields, its use in the criminal justice system has so far only served to legitimize the racial bias that already plagues the system. Instead, we must train those working in this system to be actively unbiased; fundamentally change the factors driving the wide disparities in arrest rates, incarceration rates, and bail decisions between BIPOC and non-BIPOC; and be deliberately anti-racist in the questions we ask of criminal justice, in order to begin the journey toward racial justice in America.


The YX Foundation is a coalition dedicated to community engagement at the intersection of deep technology and critical race theory.