The ethics of algorithmic fairness
In applying it to risk assessment in the criminal justice system, are we deceiving ourselves and following the wrong track?
This article questions the current direction of the ethical debate surrounding predictive risk assessment in the criminal justice system. At present, that debate revolves around how to predict criminal behaviour through machine learning (ML) in ethical ways; for example, how to reduce bias while maintaining accuracy. This falls well short of asking the more fundamental question of what we want to use ML algorithms for: should we use them to predict criminal behaviour or rather to diagnose it, intervene on it and, most importantly, better understand it? Each approach comes with a different method for risk assessment: prediction relies on regression, while diagnosis relies on causal inference. I argue that, if the purpose of the criminal justice system is to treat crime rather than forecast it, and to monitor whether its own interventions increase or reduce crime, then focusing our ethical debates on prediction is to deceive ourselves and to stay on the wrong track. Let us have a look at the present situation.
In matters of ‘ethical’ prediction of criminal behaviour, the field of algorithmic fairness has recently had a lot to say. Exponents of algorithmic fairness have identified numerous ways in which statistically driven methods like machine learning can reproduce existing patterns of individual prejudice and institutionalised bias. Some focus on reducing bias in the design process, others on reducing it in the outcomes. Overall, they emphasise the importance of predictive parity as an explicit goal; that is, the systems we use should not only be accurate overall, but also have similar accuracy rates across all the groups (e.g. different racial groups or genders) to which they are applied. Unfortunately, Kleinberg et al. proved that no mechanism can achieve both optimal accuracy and optimal predictive parity, so a trade-off between the two is needed. Others responded by presenting alternative conceptions of fairness. Often, however, these conceptions not only differ but are also at odds with each other. This is shown by the work of Berk et al., who identify six kinds of fairness and then show that these notions conflict not only with accuracy, but also with one another.
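To make the parity goal concrete, here is a minimal Python sketch of how accuracy could be compared across groups, following this article's working definition of parity as similar accuracy rates. The labels, predictions and group names are made up for illustration, and `accuracy_by_group` is a hypothetical helper, not part of any fairness library.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: fraction of correct predictions within that group}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Toy, made-up data: 1 = re-offended, 0 = did not re-offend.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = accuracy_by_group(y_true, y_pred, groups)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# The parity gap: how far apart the best- and worst-served groups are.
gap = max(per_group.values()) - min(per_group.values())

print(per_group, overall, gap)
```

On this toy data the overall accuracy is 0.75, but it is 1.0 for group A and 0.5 for group B. A model can therefore look reasonably accurate in aggregate while serving one group far worse than another, which is the situation that parity requirements are meant to rule out.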
Is the impossibility of maximising accuracy and fairness at the same time a problem? It is if we frame the added value of our algorithms as predictive accuracy. However, in the criminal justice system, is that really their added value, or does it lie in effectively supporting judges in making better decisions about whom to release, and under what conditions (i.e. how should the criminal justice system intervene in an individual’s life to mitigate specific, relevant risks)? In the first case, we choose to measure the utility of an algorithm as a predictive tool; in the second, as a diagnostic tool.
Predictive tools are often based on regression analysis. Regression enables researchers to identify variables that are predictive of an outcome of interest, without necessarily having to understand why those variables are significant. For example, given a high risk of criminal re-offence, regression analysis would identify the variables that are correlated with it, like anti-social behaviour or criminal history. However, it would not offer any insight into why this correlation arises. Furthermore, it treats this ‘risk’ as a static, statistical fact about the world. Faced with a statistically high predicted risk of someone re-offending given their anti-social behaviour, regression can only tell us what to avoid in order to contain this ‘given’ risk: releasing the offender. It forecasts crime cycles in order to arrive there first, to beat time and to play crime’s own game, rather than to treat it. Diagnostic tools, by contrast, present risk as a dynamic phenomenon, as something that can be mitigated through interventions. The discussion of causal inference that follows explains this in more detail.
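The point that regression finds predictive variables without explaining them can be illustrated with a small simulation. The sketch below is a toy example under loudly stated assumptions: the data are simulated, the variable names are hypothetical, and a hidden factor is assumed to drive both the anti-social behaviour score and the risk score. By construction, the risk score never depends on the anti-social behaviour score.

```python
import random


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


random.seed(0)

# ASSUMPTION (simulated): an unobserved factor drives BOTH variables.
hidden = [random.gauss(0, 1) for _ in range(2000)]
antisocial_score = [h + random.gauss(0, 0.5) for h in hidden]
risk_score = [h + random.gauss(0, 0.5) for h in hidden]  # no direct dependence on antisocial_score

slope, intercept = ols_slope_intercept(antisocial_score, risk_score)
print(f"fitted slope: {slope:.2f}")
```

The fitted slope is clearly positive, so the anti-social behaviour score is a useful predictor of the risk score. Yet in this simulation, changing someone's anti-social behaviour score would do nothing to their risk, because the hidden factor produces the association. Regression alone cannot tell these two situations apart.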
When we talk about using statistical tools for diagnosis, we refer to causal inference. Through causal inference we can hypothesise and test causal relationships between covariates and the outcome variable of interest. Here, the risk of someone re-committing a crime is framed as an outcome on which the causal effect of different covariates (e.g. anti-social behaviour or unemployment) can be tested. ‘Risk’ is thus presented as a dynamic phenomenon, as something that can be changed by intervening on what causes it. Why is this important? I think it is important for two main reasons. First, it is in the interest of the judicial system to learn how to treat crime. Second, it is also in our interest to monitor the effects that the criminal justice system itself has on crime. How can causal inference make this possible?
As for ‘treating crime’, causal inference makes it possible in two ways. On the one hand, it allows us to isolate and test the potential causes and cures of crime. In its experimental form, well known from medicine, causality is inferred by randomly assigning individuals or groups, referred to as units, to an intervention or treatment. Each unit subjected to a treatment may realise an outcome of interest, and upon receiving no treatment may realise an alternate outcome, also known as the counterfactual. Randomly assigning units to a ‘treatment’ and a ‘no-treatment’ group and comparing the outcomes of the two groups after the intervention gives a measure of the causal effect of the chosen intervention. The random assignment of units to treatments ensures a “balance” of the covariates (the potential confounding factors) across the groups, so that any difference in outcomes can be attributed to the treatment itself. For example, given two randomly assigned groups of convicts, where one is subjected to behavioural therapy while the other is not, it is possible to assess the effect of behavioural therapy as an intervention on criminal or violent behaviour. While such trials are well established in the medical field, the criminal justice system often rejects this possibility on ethical grounds.
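The logic of random assignment described above can be sketched as a simulation. Everything here is an assumption made for illustration: the data are simulated, `TRUE_EFFECT` is an invented number rather than an empirical finding about behavioural therapy, and each unit's baseline propensity stands in for all unobserved confounding factors.

```python
import random
from statistics import mean

random.seed(1)

# ASSUMPTION (invented for the simulation): therapy lowers the probability
# of re-offence by 10 percentage points.
TRUE_EFFECT = -0.10


def simulate_outcome(treated):
    """Return 1 if the simulated unit re-offends, 0 otherwise."""
    baseline = random.uniform(0.2, 0.6)  # unobserved individual propensity
    probability = baseline + (TRUE_EFFECT if treated else 0.0)
    return 1 if random.random() < probability else 0


n_units = 20000
# Random assignment: a coin flip decides who receives the treatment,
# so the unobserved propensities are balanced across the two groups.
assignments = [random.random() < 0.5 for _ in range(n_units)]
outcomes = [simulate_outcome(t) for t in assignments]

treated_outcomes = [o for o, t in zip(outcomes, assignments) if t]
control_outcomes = [o for o, t in zip(outcomes, assignments) if not t]

# Difference in mean outcomes estimates the average causal effect.
estimated_effect = mean(treated_outcomes) - mean(control_outcomes)
print(f"estimated effect: {estimated_effect:.3f} (simulated truth: {TRUE_EFFECT})")
```

Because the coin flip is independent of each unit's baseline propensity, the simple difference in group means lands close to the simulated effect without any need to measure or model the confounders.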
On the other hand, we are also able to measure the impact of the timing and duration of the applied intervention, an advantage severely lacking in regression-based methods and something crucial if we want to intervene on crime effectively. For example, research has shown that the timing of the initiation of behavioural therapy has an effect not only on the prison conduct of defendants, but also on the risk of recidivism. A causal inference framework can suggest when it is best to initiate behavioural therapy as an intervention.
As for examining the potential effects of the criminal justice system itself on crime, causal inference allows us to separate covariates that are not affected by an intervention from intermediate outcomes that our interventions do affect. This is important for estimating the effects that the interventions of the criminal justice system have on crime itself. For example, anti-social behaviour is often not found to be the driver of re-offence. Instead, anti-social behaviour is often shown to increase with intensive policing; it is thus an intermediate outcome of the criminal justice system’s efforts to reduce crime rather than a separate covariate.
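Why the distinction between a covariate and an intermediate outcome matters can be shown with another toy simulation. The causal structure and all probabilities below are assumptions chosen for illustration, not empirical estimates: intensive policing is assigned at random, it raises the chance of anti-social behaviour, and anti-social behaviour in turn raises the chance of re-offence.

```python
import random
from statistics import mean

random.seed(2)

rows = []
for _ in range(40000):
    policed = 1 if random.random() < 0.5 else 0
    # ASSUMPTION: policing raises anti-social behaviour (intermediate outcome).
    antisocial = 1 if random.random() < 0.3 + 0.3 * policed else 0
    # ASSUMPTION: re-offence depends on policing only THROUGH anti-social behaviour.
    reoffend = 1 if random.random() < 0.2 + 0.4 * antisocial else 0
    rows.append((policed, antisocial, reoffend))


def policing_effect(sample):
    """Difference in mean re-offence between policed and non-policed units."""
    policed_group = [r for p, _, r in sample if p == 1]
    other_group = [r for p, _, r in sample if p == 0]
    return mean(policed_group) - mean(other_group)


# Total effect of policing on re-offence (about 0.3 * 0.4 = 0.12 by construction).
total_effect = policing_effect(rows)

# "Controlling for" anti-social behaviour as if it were an ordinary covariate:
# compare policed and non-policed units within each level of it.
within_strata = {
    level: policing_effect([row for row in rows if row[1] == level])
    for level in (0, 1)
}
print(f"total: {total_effect:.3f}, within strata: {within_strata}")
```

In this simulation policing increases re-offence overall, yet within each level of anti-social behaviour its estimated effect is close to zero. Treating an intermediate outcome as a separate covariate therefore hides the effect of the system's own intervention, which is the risk the paragraph above points to.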
Overall, matters are not so simple. Efforts to conduct this kind of research are often hindered by ethical concerns, concerns that, strangely, do not prevent such research from being conducted in the medical field. Additionally, it is not always possible to test our hypotheses under experimental conditions, especially when what is being tested are the potential drivers of crime. In such cases, however, similar causal methods can often be applied to observational data. Notwithstanding these limitations, the potential benefits and insights into what drives crime as a structural problem, rather than a statistical fact, deserve at least more attention.