Sky over Atlanta

What is FATML and why should you care?

Marco Angel Bertani-Økland
Published in Compendium · Feb 2, 2019 · 7 min read


At the time of this writing, I’m sitting on a plane on my way to the ACM FAT* 2019 conference in Atlanta, Georgia. This is the second time the conference is being held, and it aims to bring together researchers and practitioners interested in fairness, accountability, and transparency in socio-technical systems. This area of machine learning is known as FATML. In what follows I’ll give you a short introduction to the FATML principles and argue why they matter for a sustainable future.

We are headed towards a society where decisions are made digitally, that is, where an algorithm behind the scenes makes choices for us. In fact, we are already there. If you need to search for something on the internet, you google it, right? And a list of results is magically presented for you to choose from. What about the movie you watched on Netflix yesterday via a recommendation? Or the next post you read on Instagram? Do you even think about how those choices were generated for you? And should you trust them blindly? In recent years we have seen cases that prove how these algorithms affect your behavior in unintended (and sometimes intended) ways. They can make you feel good or bad, alter how you view the world to shift your vote, or change your consumption habits so that you buy things you don’t actually want.

The case for human decision making

So this doesn’t sound good, right? What if, instead of trusting algorithms, we trust the human mind and human decision making? Well, in “A survey of judges” [2], a group of American researchers in the 1970s tried to answer this question. They created a set of hypothetical cases and independently asked 47 Virginia state district court judges how they would deal with each.

Here is one of them:

An 18-year-old female defendant was apprehended for possession of marijuana, arrested with her boyfriend and seven other acquaintances. There was evidence of a substantial amount of smoked and unsmoked marijuana found, but no marijuana was discovered directly in her possession. She had no previous criminal record, was a good student from a middle-class home and was neither rebellious nor apologetic for her actions.

The incredible part is the variety of the sentences. Of the 47 judges, 29 said not guilty and 18 declared her guilty. Of those who opted for a guilty verdict, 8 recommended probation, 4 wanted a fine, 3 said probation with a fine, and 3 agreed on prison. If these had been several runs of the same digital algorithm, I dare you to find someone crazy enough to put it in production! But the researchers went a bit further: among the 41 hypothetical cases given to the judges, there were 7 that appeared twice, with the names of the defendants changed. Most judges didn’t manage to make the same decision on the same case when seeing it the second time. What this shows is that humans are not very reliable at being impartial or unbiased, or even at making the same choice given the same evidence.

The case for digital decisions

So one advantage of having algorithms make decisions is that it is easier to get the same decision given the same evidence. In that way, we could at least introduce predictability into the decision process. But down this path there are several other perils ahead:

  • Your algorithm may not be fair and just. You may think you know what fairness is, but really, here are 21 definitions of fairness, and this article does a great job of grouping the definitions and simplifying the criteria. At a high level, you can require that the score of the algorithm means the same thing for the different demographic groups (i.e. groups by age, sex, race, religion, etc.); a small sketch after this list makes this concrete.
  • No clear path for redress, or lacking accountability. When a decision is made algorithmically, it should be just as easy to challenge the decision and to get an explanation of the result.
  • Lack of transparency, when we blindly trust an algorithm to make a decision. How would you know if the creators of the algorithm threw away two-thirds of the historical records to produce a formula that doesn’t reflect reality?
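To make the first point above a bit more concrete, here is a minimal sketch of the kind of check it suggests: compare how a model’s scores translate into decisions across demographic groups. The column names, the 0.5 threshold, and the toy data are my own illustrative assumptions, not anything prescribed by the fairness literature linked above.

```python
# A minimal sketch of a per-group check on a scored dataset.
# The DataFrame columns ("score", "label", "group") and the 0.5 threshold
# are hypothetical and only serve the illustration.
import pandas as pd

def group_fairness_report(df: pd.DataFrame) -> pd.DataFrame:
    """Per-group positive rate and precision of a thresholded score."""
    df = df.assign(predicted=df["score"] >= 0.5)
    return df.groupby("group").apply(
        lambda g: pd.Series({
            # Demographic-parity-style check: how often each group gets a positive decision.
            "positive_rate": g["predicted"].mean(),
            # Calibration-style check: among positive decisions, how often they are correct.
            "precision": g.loc[g["predicted"], "label"].mean(),
        })
    )

# Toy usage with fabricated data, just to show the shape of the output.
toy = pd.DataFrame({
    "score": [0.9, 0.2, 0.7, 0.4, 0.8, 0.3],
    "label": [1, 0, 1, 0, 0, 0],
    "group": ["A", "A", "A", "B", "B", "B"],
})
print(group_fairness_report(toy))
```

If the positive rates or precisions differ markedly between groups, that is a signal to dig into which of the many fairness definitions your context actually requires, rather than proof of a violation by itself.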

A consensus is emerging about what we should require from these algorithms, and about guidelines we can follow when designing them so that they can be trusted. I discuss a few here:

  • Responsibility addresses avenues of redress for adverse individual or societal effects of the system. It also demands designating an internal role for the person responsible for the timely remedy of such issues.
  • Accountability is the principle that a person who is legally or politically responsible for harm has to provide some sort of compensation or justification. Note that, in order to be accountable, you must have a degree of control, in the sense that you are in a position to cause the harm or can prevent it.
  • Explainability means that the decisions can be explained to end-users and other stakeholders in non-technical terms.
  • Fairness means that the decisions do not create a discriminatory or unjust impact on end users across different demographics.

If we are to scale human decision making algorithmically, we need to understand what we are doing. Fairness is a hard problem, and one of the reasons is that there are no clear best practices yet on how to go from your moral requirements of fairness to the correct metric to use; everything seems to be context dependent. We can’t take the narrow view that fairness can be solved mathematically. A mixture of disciplines, from ethics, law, socio-economics, and politics to mathematics and computer science, must join forces. Just communicating across these branches is a hard enough problem. But if we get this right, we will be able to build systems that don’t discriminate or generate decisions with unfair outcomes. And that is why I care about FATML, and why you need to care about it too.

So I am here at ACM FAT* 2019 in order to gain a better understanding of the challenges ahead within FATML. Traveling with me are Josephine and Denise, who are taking their M.Sc. at the IT University of Copenhagen (ITU) in Denmark, where they are part of the newly formed research group on algorithmic fairness (𝛼@ITU). They are about to write their master’s thesis on fairness in machine learning in collaboration with Computas, my current employer, with me as their industry advisor. Their participation in the conference is sponsored by Computas, and we hope to get inspired and come back with new and better research ideas within this exciting field.

Computas already delivers mission-critical solutions for the public and private sectors in Norway. We are expanding into Denmark, and our subsidiary Computas Denmark is the main driving force behind this collaboration with Josephine and Denise from ITU. With this collaboration, we aim to strengthen our commitment to delivering machine learning solutions within the principles of fairness, accountability, and transparency, and to build a better and more sustainable future.

Further reading

Here is a curated list of links if you want to go down the rabbit hole:

  1. https://fairmlbook.org/introduction.html A much better introduction than mine to why fairness in algorithmic decision making is important.
  2. https://www.amazon.com/Hello-World-Being-Human-Algorithms/dp/039363499X This book by Hannah Fry has lots of examples of what happens when we let the algorithms run loose. And it’s one of the inspirations of this post. You should definitely read it!
  3. https://www.forbes.com/sites/gregorymcneal/2014/06/28/facebook-manipulated-user-news-feeds-to-create-emotional-contagion/#650e0cfc39dc The short version is, Facebook has the ability to make you feel good or bad, just by tweaking what shows up in your news feed.
  4. https://www.oreilly.com/ideas/managing-risk-in-machine-learning A great article discussing how to manage risk in machine learning.
  5. https://ec.europa.eu/digital-single-market/en/news/draft-ethics-guidelines-trustworthy-ai This draft shows how Europe is trying to regulate the use of AI.
  6. https://www.h2o.ai/blog/what-is-your-ai-thinking-part-1/ A high-level introduction to Interpretable machine learning.
  7. https://hackernoon.com/explainable-ai-wont-deliver-here-is-why-6738f54216be A nice critical article about the challenges ahead for explainable AI. The critique is mostly about the simplicity of the explanations, but the field is now advancing to fix that issue. Explanations are invaluable, since they allow you to identify bias or racist behaviour in your algorithms.
  8. http://joshualoftus.com/post/algorithmic-fairness-is-as-hard-as-causation/ A great blog post explaining why removing protected attributes (like race, gender, etc) will not help you solve fairness.
  9. https://www.darpa.mil/program/explainable-artificial-intelligence DARPA’s program on explainable AI.

Acknowledgments

I would like to thank Josephine, Denise, and Ingrid Ruud for their helpful comments on the draft of this blog post.

Disclaimer

Computas is the author’s current employer. The views, thoughts, and opinions expressed in the text belong solely to the author, and not necessarily to the author’s employer, organization, committee or other group or individual.

References

[1] Robert Epstein and Ronald E. Robertson, ‘The search engine manipulation effect (SEME) and its possible impact on the outcomes of elections’, Proceedings of the National Academy of Sciences , vol. 112, no. 33, 2015, pp. E4512–21, http://www.pnas.org/content/112/33/E4512 .

[2] William Austin and Thomas A. Williams III, ‘A survey of judges’ responses to simulated legal cases: research note on sentencing disparity’, Journal of Criminal Law and Criminology , vol. 68, no. 2, 1977, pp. 306–310.
