Bias in AI court decision-making — spot it before you fight it
Machine learning in court decisions
The use of machine learning in decision-making processes, including in judicial practice, is becoming more and more frequent. As court decisions have a great impact on an individual’s personal and professional life, as well as on society as a whole, it is important to be able to identify and ideally rectify bias in the artificial intelligence (AI) system, so that the model does not render an unfair or inaccurate decision and potentially amplify existing inequalities in our society.
Advantages
The goal of using machine learning in court decision-making should be to make the decisions and the decision-making process better: more accurate and more just, as well as faster and less costly. Perhaps surprisingly, an AI model can actually help judges uncover and fight their own bias. The system can alert the judge when, based on historical statistical data, it detects in the judge’s language that he or she is less attentive and about to make a snap decision, or is being less empathic. The model could do this by weighing in external factors that can impact decision-making, such as the time of day, the temperature or even the proximity of an election period. The impact of such external factors on judges’ decision-making was shown in a study of parole judges in Israel, as described by Daniel Kahneman in Thinking, Fast and Slow. The observed judges suffered from so-called glucose depletion, which made them more prone to grant parole after a meal break: approval rates spiked to 65 % after each meal, whereas on average only 35 % of parole requests were approved.
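To make the idea concrete, below is a minimal sketch of what such an alerting rule might look like. Everything in it is an illustrative assumption: the inputs (time since the last break, hour of day, a language-derived attentiveness score) and the thresholds are invented for the example and do not describe any existing system.

```python
from dataclasses import dataclass

@dataclass
class SessionContext:
    minutes_since_break: float   # assumed input: time elapsed since the last meal/rest break
    hour_of_day: int             # assumed input: 0-23
    attentiveness_score: float   # assumed input: 0.0-1.0, e.g. derived from language analysis

def should_alert_judge(ctx: SessionContext) -> bool:
    """Hypothetical rule-of-thumb alert: flag sessions where external factors
    associated with snap decisions (fatigue, long time since a break,
    late-day sessions) accumulate. Thresholds are illustrative only."""
    risk_points = 0
    if ctx.minutes_since_break > 90:
        risk_points += 1
    if ctx.hour_of_day >= 16 or ctx.hour_of_day < 9:
        risk_points += 1
    if ctx.attentiveness_score < 0.4:
        risk_points += 1
    return risk_points >= 2

# Example: a late-afternoon session, two hours after the last break
print(should_alert_judge(SessionContext(minutes_since_break=120, hour_of_day=17, attentiveness_score=0.5)))
```

In a real deployment, such thresholds would have to be calibrated against historical data, for instance the kind of parole statistics described above, rather than picked by hand.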
Status quo
Obviously, at the current stage of technology, only certain court decisions, or parts of the court decision-making process, can be made by algorithms. AI has been used in so-called predictive policing, where, based on the available data, the algorithm helps the police or the court decide on a particular aspect of the case, such as granting parole, deciding on bail or determining the appropriate sentence. Courts can use such software, for example, to assess the risk that the defendant would commit another crime while on parole, whether he or she would appear for the court date if bail is granted, or whether probation should be considered. Additionally, machine learning has also been used in the actual rendering of verdicts, usually in small civil law disputes, including deciding on or overturning parking fines. Estonia has recently unveiled a pilot robo-judge that would adjudicate disputes involving small monetary claims.
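At its core, such a risk-assessment tool is a classifier trained on historical outcomes. The toy sketch below illustrates the principle only; the features, data and model choice are invented for the example and are far simpler than anything a real system would use.

```python
# Toy sketch of a pretrial risk-assessment classifier.
# All features and data are synthetic; a real tool would use far richer inputs.
from sklearn.linear_model import LogisticRegression

# Hypothetical features per past defendant: [prior_offences, missed_court_dates, age]
X_history = [
    [0, 0, 45], [3, 2, 22], [1, 0, 34], [5, 3, 28],
    [0, 1, 52], [2, 2, 19], [4, 1, 31], [0, 0, 61],
]
# Historical label: 1 = failed to appear for the court date, 0 = appeared
y_history = [0, 1, 0, 1, 0, 1, 1, 0]

model = LogisticRegression().fit(X_history, y_history)

# Estimated probability that a new defendant fails to appear if bail is granted
new_defendant = [[2, 1, 25]]
print(model.predict_proba(new_defendant)[0][1])
```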
Bias in AI court decisions
Despite the advantages being numerous and with the technologically advancement, potentially endless, a risk-averse approach to the deployment of AI in court decision-making is crucial. Prior to launching any algorithm to replace a human judge in rendering the verdict, we have to be sure it will render a decision at least as just and justifiable as the human judge would. The European Commission’s High-Level Expert Group on AI considers an AI system to be trustworthy when it is lawful, ethical and robust.
The biggest issue to consider when talking about ethical AI is the presence of bias, be it conscious or unconscious, in the algorithm itself or in the data, as it can distort the calculation and prediction process.
- The bias can be the result of mistakes in sampling and measurement, causing the data to be incomplete, based on too few observations or simply wrong. This is the situation where, due to negligence in the data collection or production process, the model ends up using bad data. Such a bias can in theory be corrected by redoing the data collection and production process, including the missing data or replacing the bad data. However, if the data is missing because it simply does not exist in the first place, correcting the bias is a more difficult task. That can happen when, for example, certain types of crimes are in practice not investigated, and thus a group of offenders is not prosecuted, because of biased police practices.
- The data can also carry within itself a prejudice reflecting inequalities in society. The most common underlying biases concern racial and gender inequalities, as well as those related to a person’s social background or sexual orientation. This can be reflected in the content as well as in the language of the data, and it does not have to be mentioned explicitly. For example, the way the case facts and circumstances or the defendant’s actions are described can carry information about the defendant’s race or social class. An algorithm predicting the risk associated with a certain defendant’s behaviour, built upon historical data from a district with a higher level of racial intolerance, could reflect law enforcement’s disproportionate targeting of African Americans, resulting in the overrepresentation of such data in the final pool of collected data (a simple sketch of such a representation check follows after this list).
- Additionally, the bias can be caused by data that is dirty (as opposed to clean, good-quality data), when it reflects or is influenced by fraudulent information, falsified documents, planted evidence or other manipulated or unlawful facts. Such a bias, if uncovered, moves the use of the whole AI system into a potentially illegal sphere and increases the accountability risk for both the developer and the user.
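As a simple illustration of how the overrepresentation described above could be surfaced, the sketch below compares how often each group appears in a collected dataset with its share of a reference population. The group labels and numbers are purely hypothetical.

```python
# Sketch of a simple representation check: compare how often each group appears
# in the collected data against its share of the reference population.
# Group labels and numbers are illustrative assumptions only.
from collections import Counter

def representation_ratios(records, population_shares):
    """Return, per group, how over- or under-represented it is in the data
    (ratio > 1 means overrepresentation relative to the population)."""
    counts = Counter(r["group"] for r in records)
    total = sum(counts.values())
    return {
        group: (counts.get(group, 0) / total) / share
        for group, share in population_shares.items()
    }

# Hypothetical arrest records and hypothetical population shares
records = [{"group": "A"}] * 70 + [{"group": "B"}] * 30
population_shares = {"A": 0.4, "B": 0.6}

print(representation_ratios(records, population_shares))
# e.g. {'A': 1.75, 'B': 0.5}: group A appears far more often than its population share
```

A ratio well above or below 1 does not prove bias by itself, but it flags where the data collection process deserves a closer look.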
Fighting the bias
A model built upon and using bad or dirty data runs the risk of further propagating discrimination and inequalities in society by widening the gap between the output of its work (the decision) and social values (equal treatment, a just process, etc.), ultimately rendering an unjust or inaccurate decision. It is thus vital to discover and eliminate unfair bias (as opposed to bias that was introduced on purpose) before, or in the worst-case scenario, after the AI system is deployed.
As the AI systems used in the public sphere, including by courts, are developed predominantly by private companies, they do not, unless explicitly programmed that way, carry an inherent commitment to the protection of justice or human rights. In other words, the system is not ethical unless it is made to be so.
AI model due diligence
Ideally, AI, due to its great potential impact on our lives and our basic human rights, would be regulated and overseen in a similar, if not stricter, way than other important sectors, such as air traffic, healthcare or law practice. There are currently numerous initiatives ongoing at the national, European and international levels to define the key principles of AI regulation (see the OECD AI Principles). However, any well-defined regulatory regime for now seems more of a utopia than something achievable in the short term. In the meantime, to make sure that an AI system deployed now renders as fair and accurate a decision as possible, the software has to be subject to a continuous audit process.
No matter how accurate the system seemed when first deployed, the court using it would have to make sure, on an ongoing basis, that it is performing consistently and rendering fair decisions that compare, quality-wise, to those of human judges. Although introducing a corrective measure to eliminate an identified bias is in theory feasible, practice has shown that making an AI system ethical by artificially introducing certain variables is very difficult.
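One way such an ongoing audit could look in practice is a periodic fairness check on the decisions the system has made since deployment, for example comparing false positive rates across groups. The sketch below is a simplified assumption of what such a check might compute; the groups, the metric choice and the tolerance threshold are illustrative, not prescriptive.

```python
# Sketch of one ongoing audit check: compare false positive rates of the
# deployed risk tool across two groups and flag the model for review if the
# gap exceeds a tolerance. Group names and the threshold are assumptions.

def false_positive_rate(decisions):
    """decisions: list of (predicted_high_risk: bool, reoffended: bool)."""
    negatives = [d for d in decisions if not d[1]]   # people who did not reoffend
    flagged = [d for d in negatives if d[0]]         # ... but were flagged high-risk anyway
    return len(flagged) / len(negatives) if negatives else 0.0

def audit_fairness(decisions_group_a, decisions_group_b, tolerance=0.1):
    gap = abs(false_positive_rate(decisions_group_a) - false_positive_rate(decisions_group_b))
    return {"fpr_gap": gap, "needs_review": gap > tolerance}

# Hypothetical batch of decisions collected since deployment
group_a = [(True, False), (False, False), (True, True), (False, False)]
group_b = [(False, False), (False, False), (True, True), (False, False)]
print(audit_fairness(group_a, group_b))   # flags the gap for human review
```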
Individual case due diligence
Following an audit of the AI model itself, the second layer of the overall AI court decision-making model is ensuring that it is verifiable from the perspective of the individual parties concerned. We should not forget that the AI system does not feel or care, and thus, should it render an unjust decision, it would not be conscious of it. Nor does it, by default, provide reasons as to how it arrived at one verdict or another. It thus inherently lacks the core quality that would make it trustworthy in the eyes of the case parties — explainability.
AI explainability, or explicability, refers to a due diligence process that enables the concerned parties to ask for an explanation behind a machine learning decision that has a legal or other significant impact on them, and to potentially challenge it. While part of the general auditing obligation, this measure also entails the parties’ right of access, to the extent possible and reasonable, to the data used and the information generated by the AI model.
In practice, it may not always be easy, or even possible, to unveil the reasoning behind a decision made by an AI model. Often, the prediction models with the highest level of explainability, such as decision trees and classification rules, lack prediction accuracy, and vice versa: neural networks are typically very accurate, yet very opaque as to how they make their calculations. Nevertheless, even when using deep learning, the overall auditing of the AI model and ensuring the highest possible traceability of the decision-making process would provide some level of transparency and thus explicability.
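To make the trade-off concrete, the sketch below trains a deliberately shallow decision tree (on invented data and features) and prints its learned rules verbatim, the kind of human-readable reasoning that could be handed to the case parties and that a deep neural network does not offer out of the box.

```python
# Minimal illustration of an explainable model: a shallow decision tree whose
# learned rules can be printed as text. Features and data are synthetic.
from sklearn.tree import DecisionTreeClassifier, export_text

feature_names = ["prior_offences", "missed_court_dates"]
X = [[0, 0], [1, 0], [3, 2], [5, 1], [0, 1], [4, 3]]
y = [0, 0, 1, 1, 0, 1]          # 1 = historically assessed as high risk

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The full decision logic can be handed to the parties as human-readable rules
print(export_text(tree, feature_names=feature_names))
```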
Training in AI
The AI systems used for court decision-making are generally not developed or implemented by the courts themselves, but by an outsourced developer or vendor. The decision-makers, as well as the concerned parties, thus often lack knowledge and understanding of how the system works and on what criteria it bases its decisions. Furthermore, the AI system may not be deployed to completely replace the human judge and lawyers, but merely to complement them.
In either scenario, it would be beneficial if judges and lawyers had a good — or at least some — understanding of the AI model used, its input variables and its predictive methods. It is neither feasible nor necessary to train all lawyers and judges in neural networks now. However, as a first step, we could focus on training them to uncover bias and dirty data in regular decision-making, so that legal professionals become more conscious of the use of discriminatory language or fraudulent data. Higher bias awareness, combined with a basic understanding of machine learning processes, their benefits and limitations, could help the parties involved make sense of the information provided through the due diligence process and improve the use of the so-far imperfect AI models in decision-making.
Systemic changes
Even when in-depth due diligence can be run before the software is deployed, as well as during its use, in order to uncover and fix any potential bias, it would not be enough to render perfect (that is, accurate and just) decisions. The bias in the collected data has a great chance of reflecting the existing injustice and inequalities in society. Unless particular attention is paid to such cultural and social norms and stereotypes, or to wrong and immoral law enforcement practices and policies, when creating the dataset, and unless these are purposefully rectified, we only risk further amplifying the bias they carry.
Only if the processes of data collection, production and labelling, and the use of the algorithm, follow well-defined rules and are understood and overseen, can we ensure that all the actors participating in the decision-making, be they judges, lawyers, clerks or law enforcement authorities, strive for the highest level of accuracy and fairness.
If we cannot deploy a bias-free system, or one that would give the parties sufficient reasons to believe it reached a fair decision, an intermediate solution could be to use AI systems as a complement to human decision-making. That way, we can speed up the judicial process, analyse the case facts in greater depth or save costs, while at the same time protecting justice. Just because a technology is available does not mean it should replace existing policies.