Fairness and AI
Sandra Wachter on why fairness cannot be automated
How European Union non-discrimination laws are interpreted and enforced varies by context and by each state’s definitions of key terms, like “gender” or “religion.” Non-discrimination laws become even more challenging to apply when discrimination, whether direct or indirect, stems not from an individual or an organization but from the data used to train algorithms. In some cases, for instance, people may not even be aware of the discrimination because the algorithms are “black boxes.”
Sandra Wachter, a Faculty Associate at the Berkman Klein Center, Visiting Professor at Harvard Law School and Associate Professor and Senior Research Fellow in Law and Ethics of AI, Big Data, Robotics and Internet Regulation at the Oxford Internet Institute (OII) at the University of Oxford, joined BKC’s virtual luncheon series to discuss these issues and why fairness cannot be automated.
Although AI presents a challenge for non-discrimination laws, Wachter said, it should be viewed as “a feature, not a bug.” The ideas Wachter shared are based on a recent paper, co-authored with Brent Mittelstadt of OII and Chris Russell of the University of Surrey.
The challenges of applying non-discrimination laws to algorithmic decisions formed the basis for “contextual equality,” a term Wachter and her colleagues coined in their paper to describe the importance of context in the “often intuitive and ambiguous discrimination metrics and evidential requirements used by the European Court of Justice.”
After the event, Wachter spoke with the Berkman Klein Center to answer some questions from attendees.
Have you taken into account the definitions of ‘discrimination’ and ‘fairness’ as they are understood by AI researchers? Should it matter to lawyers?
In our paper, we analyzed the case law of the European Court of Justice and abstracted what the Court considers “fair” from its non-discrimination rulings over recent decades.
In a second step, we compared the Court’s notion of fairness to the current technical fairness metrics that have been developed by the computer science community.
This was not an easy task, as the Court of Justice does not stick to consistent methods or metrics for assessing non-discrimination cases. In fact, the case law and the European legislation embrace what we call “contextual equality.” Laws and case law are purposefully agile and fluid to offer appropriate legal responses in a constantly changing society.
Nonetheless, we have found that Conditional Demographic Disparity (CDD) is the technical fairness metric that represents the closest legal translation of the Court’s “gold standard” for assessing discrimination.
Which grounds of discrimination are the most vulnerable in the context of algorithms?
We know that data-driven decision-making has become commonplace both in the public and the private sectors. And we know that those systems can discriminate against people just as much as humans can.
However, the legal tools to prevent, investigate, and punish illegal discrimination cannot be easily translated to algorithms as discriminators.
Most cases are decided on common knowledge and intuition. For example, it is easy to see how less favorable employment laws for part-time workers, a ban on headscarves, or only granting benefits to married couples can lead to discrimination based on gender, religion, or sexual orientation. No numbers or complex statistical metrics are needed to see a potential conflict.
This type of intuition and social understanding is diminished when algorithms are the discriminators. AI can use unintuitive proxy data to make decisions about humans that do not set off our alarm bells.
Similarly, claimants might not make use of their legal remedies, simply because they are not aware that they are being discriminated against in the first place. For example, we might not know that we are not being shown a job advertisement, that we are being charged a higher price for a product, or that we have been filtered out of certain search results.
This means everyone is at a higher risk of being discriminated against and is less likely to receive effective protection.
However, we show in the paper that specific groups are especially vulnerable. The wrong choice of fairness metric could be particularly detrimental for smaller groups: minority groups such as the LGBT* community, religious minorities, many races and ethnicities, and especially intersectional groups.
Conditional demographic disparity (CDD) can help identify discrimination against such groups because, unlike other tests, it does not depend on the absolute size of the group but on its proportional treatment relative to others.
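The idea behind CDD can be sketched in a few lines of code. The following is an illustrative sketch only, not the paper’s exact formulation: demographic disparity (DD) for a group is taken as its share among rejected outcomes minus its share among accepted outcomes, and CDD averages DD within strata (here a hypothetical “dept” field), weighted by stratum size. All field names and data are invented for illustration.

```python
# Illustrative sketch of Conditional Demographic Disparity (CDD).
# Assumptions: binary accept/reject outcomes, a single group label per
# record, and a single conditioning attribute ("dept"); all hypothetical.
from collections import defaultdict


def demographic_disparity(records, group):
    """DD = P(group | rejected) - P(group | accepted)."""
    rejected = [r for r in records if not r["accepted"]]
    accepted = [r for r in records if r["accepted"]]
    p_rej = sum(r["group"] == group for r in rejected) / len(rejected) if rejected else 0.0
    p_acc = sum(r["group"] == group for r in accepted) / len(accepted) if accepted else 0.0
    return p_rej - p_acc


def conditional_demographic_disparity(records, group, stratum_key):
    """Size-weighted average of DD computed within each stratum."""
    strata = defaultdict(list)
    for r in records:
        strata[r[stratum_key]].append(r)
    n = len(records)
    return sum(len(rs) / n * demographic_disparity(rs, group) for rs in strata.values())


# Toy applicant data: group B is rejected disproportionately in dept "y".
applications = [
    {"group": "A", "dept": "x", "accepted": True},
    {"group": "A", "dept": "x", "accepted": False},
    {"group": "B", "dept": "x", "accepted": True},
    {"group": "B", "dept": "y", "accepted": False},
    {"group": "A", "dept": "y", "accepted": True},
    {"group": "B", "dept": "y", "accepted": False},
]

cdd = conditional_demographic_disparity(applications, "B", "dept")  # → 0.25
```

A positive value suggests group B is over-represented among rejections relative to acceptances even after conditioning on department; because the metric compares proportions rather than raw counts, a small group’s disparity is not drowned out by its size, which is the property described above.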
Given that the development of AI-based solutions is a very important goal for both companies and some governments, does the problem of lack of fairness require an update of the regulations in force today (like the GDPR and others) or a new kind of legal approach? Or is it a problem that can still be solved only on a case-by-case basis?
I think it has to be a mixture of both. On the one hand, we need to establish standards for ethical algorithmic decision-making, which we currently don’t have. On the other hand, we want to maintain flexibility and allow the courts to assess fairness on a case-by-case basis. I am currently working on a research project based on the findings of my paper A Right to Reasonable Inferences, which concluded that current data protection standards are insufficient to protect us from the inferential power of algorithms. The aim of the project is to assess how we need to reconceptualize laws, such as data protection law and non-discrimination law, when algorithms and not humans make decisions about us. It is crucial to figure out whether the laws we have in place will protect us just as much when AI systems, not humans, are assessing us.
Is it realistic to develop a “fairness by design” protection in AI developments?
Yes, I think this is possible, but it’s crucial that the legal and technical communities work closely together. The tech community should embrace that “contextual equality” is a feature of EU non-discrimination law, not a bug that needs fixing. And the legal community will see that, when assessing data-driven cases, intuition alone might not be sufficient to detect discrimination; rather, new coherent strategies can be helpful.
This is not to say that we should only or always rely on statistical evidence. The case law and the courts in the EU encourage and promote a rich and diverse evidence base. This diversity is important to maintain in the future. However, if statistics are being used, it is crucial that cases do not deteriorate into a ‘battle of numbers.’
We envision the creation of summary statistics that level the playing field for all parties involved. We see Conditional Demographic Disparity (CDD) as a treasure map that shows you where to look, but not what to think. It warns you of the dangers and removes the blinders of intuition without proposing a specific legal interpretation, decision or intervention.
CDD used as a statistical baseline measure will be of service to judges, regulators, industry, and claimants. Judges will have a first frame to investigate prima facie discrimination, regulators will have a baseline for investigating potential discrimination cases, industry can prevent discrimination or refute potential claims, and victims will have a reliable measure of potential discrimination to raise claims.
CDD can, on the one hand, allow for coherent assessment, and on the other hand, allow for agile and contextual interpretation. This brings both disciplines closer together by learning from each other’s strengths.
Do we need transparency laws for AI and algorithms that outweigh intellectual property laws, so that AI implementations and their algorithms become accessible to users, judges, lawyers, etc.?
This is again a question of contextuality. In some instances, the potential harm can be so great that intellectual property rights and trade secrets do not outweigh the public interest in investigating or preventing this harm. Of course, we can also think of many applications that do not pose a high risk for individuals, groups, or society as a whole, in which case economic considerations can outweigh others. In the same vein, the level of transparency required, and the audience that needs access, will also depend on the circumstances, the application, the sector, and the parties involved.
Wachter’s talk was part of Berkman Klein’s AI Policy Practice.