Ethics for powerful algorithms (3 of 4)

Abe Gong
7 min read · Aug 2, 2016


(Hi, all! Apologies for the long radio silence — my day job has been all-consuming.

For those of you joining us for the first time, this series is about the controversies/risks/concerns around using algorithms in the criminal justice system. You might want to check out my first post here, and the second post here, to come up to speed on the controversy around COMPAS.)

COMPAS is an algorithm used to predict which criminals are most likely to commit future crimes. It’s controversial because it’s widely used in decisions about bail and parole — who gets to walk free when.

This series takes COMPAS as a case study in the ethics of powerful algorithms. So far, we’ve seen that the statistical case against COMPAS is less clear-cut than the headlines suggest, and that an algorithm can still be deeply unfair even if it isn’t statistically biased.

My goal in this post is to move the conversation around COMPAS away from statistics and onto underlying values.

I’ll start with a side we haven’t heard from yet: policymakers who support algorithms like COMPAS as a tool for prison reform. From there, I’ll lay out the (often conflicting) goals our society has for its criminal justice system, and evaluate how predictive algorithms can amplify those values.

A warning: we’ve moved past the realm of easy certainty. Resolving these questions requires balance among competing values. I can lay out the tensions and tradeoffs, but I can’t promise The Answer.

In defense of algorithms

Last time, I argued that an algorithm like COMPAS might be deeply unfair, even if it’s not statistically biased. Of course, there are potential benefits as well. Supporters of algorithms argue that they offer a way to create a more humane and efficient prison system, by

  • shortening needlessly harsh sentences for prisoners with low risk of re-offending.
  • reducing overcrowding in state and federal prisons.
  • providing guidance to parole boards who may be inexperienced or lack the time to engage deeply with each case.
  • serving as a check on human biases, like racial prejudice or crankiness before lunch (really).

These impacts are real, and they’re hard to write off. Here’s a paraphrased excerpt from GCN, a government tech and IT magazine. This is from 2013, before the controversy around COMPAS went mainstream.

At least 15 states, looking to cut costs on incarceration, now require their prison systems to use some form of risk assessment tool in evaluating inmates, and many of them are turning to predictive analytics software.

The software programs measure factors such as inmates’ age when first convicted, education, whether they think their conviction was justified and whether they’re married. Some programs measure 50 to 100 factors overall, in contrast to the relative handful weighed by parole boards, many of whose members are political appointees without much or any training in criminology.

Adding software-driven assessments appears to be having an effect. Populations in state and federal prisons fell by 1 percent in 2011 and appear to have fallen further in 2012, according to available reports. And that’s at least partly because of a drop in recidivism: the percentage of parolees going back to prison dropped from 15 percent in 2006 to 12 percent in 2011.

In other words, tens of thousands of people are walking free early because of predictive algorithms. At the same time, the recidivism rate has dropped. Shorter sentences; less crime. What’s not to like?

A fundamental tension in values

This is a bigger issue than COMPAS. Some 2.4 million Americans are currently incarcerated. We hold more people in our jails and prisons than China does, and proportionally more minority prisoners than South Africa did at the height of apartheid. Why do we keep all of these people locked up?

There’s no consensus on the right answer to this question. Here are four of the most common responses, along with their ethical implications for algorithms like COMPAS.

1. Because they deserve to be punished for behaving badly.

In my second post, we saw that COMPAS relies in part on questions about Family Criminality, Peers, and Social Environment. For example:

  • If you lived with both parents and they later separated, how old were you at the time?
  • How many of your friends/acquaintances have ever been arrested?
  • In your neighborhood, have some of your friends or family been crime victims?

If punishment is about holding people accountable for their own bad behavior, these questions make COMPAS unfair. I am not accountable for the actions of others. My sentence should not be longer because my parents separated, or because a friend of mine was once arrested.

We could solve this problem by stripping out questions for which the defendant isn’t morally responsible — but doing so would make the algorithm less accurate.

This is an inescapable tradeoff for anyone trying to make fair predictions, whether you rely on an algorithm or your own judgment. To the extent that our actions are influenced by our environment, there will always be a direct tension between respecting individual agency and accurately predicting future outcomes.

There will always be a direct tension between respecting individual agency and predicting future outcomes — with or without algorithms.
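To make that tradeoff concrete, here is a minimal sketch using synthetic data. It is not COMPAS (the actual model is proprietary), and the features and coefficients are invented for illustration; it only shows the general point that when a factor outside the defendant’s control genuinely predicts re-offense, a model forbidden from using it scores measurably worse.

```python
# A toy sketch, NOT the real COMPAS model: synthetic data in which an
# "environment" factor the defendant doesn't control genuinely predicts
# re-offense. Dropping that factor costs the model accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 20_000

prior_arrests = rng.poisson(1.0, n)      # the defendant's own conduct
friends_arrested = rng.poisson(2.0, n)   # environment: other people's conduct

# Assumed ground truth for the simulation: both factors raise re-offense risk.
logit = -2.0 + 0.6 * prior_arrests + 0.4 * friends_arrested
reoffended = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_full = np.column_stack([prior_arrests, friends_arrested])
X_conduct_only = X_full[:, :1]

for label, X in [("conduct + environment", X_full), ("conduct only", X_conduct_only)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, reoffended, random_state=0)
    model = LogisticRegression().fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{label:>22}: AUC = {auc:.3f}")
```

Run it and the “conduct + environment” model posts a higher AUC than the “conduct only” model. Strip the environment questions and you pay for it in accuracy; keep them and you pay for it in fairness.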

2. To protect innocent people from future crimes.

Let’s run the numbers.

  • The Bureau of Justice Statistics reports 4.7 million people “under community supervision” (parole or probation) in 2013.
  • The GCN article credits risk scores with a three-percentage-point drop in recidivism (from 15% to 12%) among parolees.

If that’s the case, then some back-of-envelope math says forecasting tools like COMPAS are preventing 140,000 crimes per year. That’s a lot of would-be victims. Should their property, wellbeing, and lives figure into our decision making?
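Here is that back-of-envelope arithmetic, taking both figures at face value:

```python
# Rough estimate: re-offenses avoided per year if risk scores really cut
# recidivism by three percentage points across everyone under supervision.
under_supervision = 4_700_000     # BJS, 2013
recidivism_drop = 0.15 - 0.12     # GCN: 15% -> 12%

crimes_prevented = under_supervision * recidivism_drop
print(f"{crimes_prevented:,.0f} crimes prevented per year")  # ~141,000
```

The estimate is generous (it attributes the entire drop to risk scoring, across the whole supervised population), but even a fraction of that number is a lot of harm avoided.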

The counterargument is that predictive parole moves us dangerously close to a Minority Report approach: punishing people for crimes not yet committed. To some extent, our laws already criminalize the future by allowing prosecution for “conspiracy to commit” or “attempted” crimes. Are we comfortable moving further in that direction?

This tension between protecting criminals’ rights and protecting others from potential harm arises every time a police officer walks a beat, a judge sets bail, or a board makes a parole decision. The dilemma is older and deeper than technology. Algorithms simply draw attention because they make the risks visible and explicit.

These dilemmas are older and deeper than technology. Algorithms draw attention because they make the risks visible and explicit.

3. To rehabilitate criminals.

Paths to rehabilitation are a tragically neglected aspect of criminal justice.

Cathy O’Neil recently asked some productive questions about the impact that algorithms could have in this domain. Here’s a paraphrase:

I’ve heard people call for removing recidivism models altogether, but honestly I think that’s too simple. I think we should instead have a discussion on what they show, why they’re used the way they are, and how they can be improved to help people.

If we’re seeing way more black men with high recidivism risk scores, we need to ask ourselves: why are black men deemed so much more likely to return to jail? Is it because they don’t have job opportunities when they get out of prison? Or because their families and friends don’t have a place for them to stay? Or because the cops are more likely to re-arrest them because they live in poor neighborhoods or are homeless?

Second, why are recidivism risk models used to further punish people who are already so disadvantaged? It keeps them away from society even longer, casting them further into a cycle of crime and poverty. If our goal were to permanently brand and isolate a criminal class, we couldn’t ask for a better tool. We need to do better.

These are all questions we need to answer, but we cannot answer without data. So let’s collect the data.

(By the way, Cathy’s book Weapons of Math Destruction is worth a read. It’s a worst-case perspective on the harm that unchecked algorithms can inflict on society.)

If you want a positive example, take a look at Cherry Tree Data Science. The brainchild of HR analytics master Zev Eigen, Cherry Tree helps employers manage the risk of hiring applicants with criminal records.

Like COMPAS, it’s based on a recidivism algorithm, but unlike COMPAS, the inputs to Cherry Tree’s algorithm prioritize indicators of personal intent over socioeconomic background. The algorithm is designed to provide former convicts with a long-term path away from crime, starting with a steady job. At the same time, employers can tap a labor pool they would otherwise have shied away from.

In other words, Cherry Tree was built with forgiveness in mind. Even though some of the math is the same as COMPAS, the values — and ultimate impact — are very different.

4. To run a more efficient prison system.

Prisons are run by fallible human beings with imperfect information, limited budgets, and complex legal and political constraints. No one argues that prisons exist solely to run themselves efficiently, but policymakers have to take these realities into account.

That’s what led to the development of tools like COMPAS to begin with: a booming, overcrowded prison population.

I imagine that when procurement officers send out RFPs for algorithms like COMPAS, they say, “Make it as accurate as possible. And make sure it’s not racially biased or we’ll never hear the end of it.”

As we saw earlier, evidence to date suggests that algorithms like COMPAS do a good job solving the problems they were intended to solve. They’ve helped alleviate overcrowded prisons and shorten harsh sentences without increasing recidivism rates.

Where next?

These are the values at stake in the debate about algorithms in courts and prisons. Ultimately, they’re less about statistics and data than competing goals in criminal justice.

Algorithms for parole have been driven largely by the fourth value: efficiency. Given the evidence, it’s hard to argue that algorithms like COMPAS haven’t produced significant gains in efficiency. They’ve also been effective on #2: preventing future crimes. But in the process we’ve compromised on fairness (#1), and made little or no progress on rehabilitation (#3).

Are these tradeoffs worth it? As I said at the beginning, there’s no easy answer to this question. I have my views — and I’m looking forward to sharing them (along with suggestions for moving forward) in my next post.
