Machine Learning and The Law
When machines use predictions to make decisions that can impact the rest of our lives, we are entitled to explanations of how they work and how predictive models were created (including the data that was used).
Last week I went to the workshops at NIPS (the biggest ML conference in the world), and I also attended part of the ML and the Law symposium the day before. I found out about the symposia a little too late, but I was still able to attend two panels that brought together lawyers and computer scientists. They were very insightful and informative. Did you know that this spring, the European Union passed a regulation giving its citizens a “right to an explanation” for decisions made by machine-learning systems?
Below are the rest of my notes from the symposium…
Auditing Machine Learning / Artificial Intelligence
The panel discussions were motivated by the problem of explaining ML-powered decisions which have an important impact on people’s lives:
- rejection of school / job / loan application
- pricing of healthcare or home insurance
- medical treatment you’ll have to take for the rest of your life
- denial of parole
“Advances in ML and AI mean that predictions and decisions of algorithms are already in use in many important situations under legal or regulatory control, and this is likely to increase dramatically in the near future” (ML & the Law workshop website)
We need to be able to test how systems get to their conclusions; if we can’t test, we can’t contest. Individuals are entitled to know which data about them is being processed, and to explanations of how predictions and decisions work, in terms they can understand. On this topic, panelists agreed that we need to be clearer about what a correct explanation is and what it should contain. (Maybe this notion changes from one domain to another. Do we need explanations in natural language, or with a certain form or structure?)
The promise of AI (and of ML, which is part of it) is to enable better decisions. We humans have many cognitive biases that can make us poor decision makers in many situations. But if society as a whole is to depend on ML, we all have to learn to understand how it works (lawyers, but also the general public). Lawyers need to understand the methodology and how computer science works. In the context of machine learning, that means understanding the data, how it’s collected, and the biases that may be lying in it. Everyone should know what ML allows us to do and what its limitations are. Models learnt from data may be biased because the data collection was biased (e.g. data on credit defaults that only covers loans approved by some earlier process, so it says nothing about applicants who were rejected). ML can also be used in ways that are completely nonsensical: for instance, if you train a model on pictures of dogs and cats, and then you show it a plant, the model will see a dog or a cat!
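The dogs-and-cats point can be made concrete with a minimal sketch. It assumes a hypothetical two-class classifier whose last step is a softmax; the labels, the `predict` helper and the scores are all made up for illustration. Because softmax probabilities always sum to 1 over the classes the model knows, it has no way to answer “neither”.

```python
import math

LABELS = ["dog", "cat"]  # the only classes this hypothetical model knows


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most probable known label and its probability."""
    probs = softmax(logits)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs[best]


# Made-up scores the model might produce for a photo of a plant.
plant_logits = [0.3, -0.2]
label, confidence = predict(plant_logits)
print(label, round(confidence, 2))
```

Whatever the input, the answer is always one of the two known labels, with a confidence of at least 0.5. Detecting inputs that are unlike the training data has to be handled separately.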
We should be able to scrutinize algorithms that learn models from data, the data itself (from its collection to its preparation in ML-ready form — a.k.a. “featurization”), and the algorithms that make predictions and decisions from a given model. (However, it may not be that clear how we can audit what happens behind Intellectual Property walls.)
Why simple decision trees beat advanced deep learning
If we are to use ML to improve many aspects of our lives, we need models to be interpretable so we can fully trust them.
Even though their overall accuracy can be lower, decision trees are often preferred to deep neural networks in these settings because the models they produce are much easier for anyone to understand. Beyond traditional algorithms such as CART, there has been work to provide even more interpretable models (e.g. Scalable Bayesian Rule Lists, presented at PAPIs ’16) or to improve the performance and stability of decision trees in certain families of use cases (e.g. Craft AI, presented at APIdays).
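To illustrate why a decision tree is easy to explain, here is a minimal sketch in plain Python. The tree is written by hand rather than learnt with CART, and its features and thresholds (`income`, `missed_payments`) are invented for a hypothetical loan application. The point is only that every decision comes with the exact list of rules that produced it.

```python
# Hypothetical loan-approval tree; features and thresholds are invented.
TREE = {
    "feature": "income",
    "threshold": 30000,
    "below": {"decision": "reject"},
    "at_or_above": {
        "feature": "missed_payments",
        "threshold": 2,
        "below": {"decision": "approve"},
        "at_or_above": {"decision": "reject"},
    },
}


def decide(node, applicant):
    """Walk the tree and return (decision, list of rules that were applied)."""
    rules = []
    while "decision" not in node:
        feature, threshold = node["feature"], node["threshold"]
        value = applicant[feature]
        if value < threshold:
            rules.append(f"{feature} = {value} < {threshold}")
            node = node["below"]
        else:
            rules.append(f"{feature} = {value} >= {threshold}")
            node = node["at_or_above"]
    return node["decision"], rules


decision, rules = decide(TREE, {"income": 42000, "missed_payments": 3})
print(decision)
for rule in rules:
    print("  because", rule)
```

The rejected applicant can read the two rules that applied to them and contest either the data or the thresholds, which is much harder to do with millions of neural network weights.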
There are actually many different approaches to interpretability of models, and there was a whole workshop on that topic at NIPS (Interpretable Machine Learning for Complex Systems). There was also a great talk entitled “Why should I trust you” by Prof. Carlos Guestrin of Apple and University of Washington at the AI for Data Science workshop, which I attended and will write about in my next post. One of the points from that talk is that we don’t just want to select accurate models, but models that make sense to us!
We want to use both our understanding of the process behind the creation of a particular ML model, and our understanding of how that model works, to make sure that decisions based on the model are fair and that they don’t discriminate against people. There should be sufficient transparency and clarity regarding which information is considered and the rules for using it (e.g. how it’s weighted in decisions).
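As a rough illustration of what “how it’s weighted in decisions” could look like, here is a minimal sketch of a linear scoring model that reports each piece of information’s contribution to the final score. The feature names, weights, bias and threshold are all hypothetical; a real system would learn them from data and might not be linear at all.

```python
# Hypothetical weights of a linear scoring model; all values are invented.
WEIGHTS = {"income_k": 0.05, "years_employed": 0.4, "missed_payments": -1.5}
BIAS = -2.0
THRESHOLD = 0.0  # approve when the score is at or above this value


def explain(applicant):
    """Return the decision, the score and each feature's contribution."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    decision = "approve" if score >= THRESHOLD else "reject"
    return decision, score, contributions


decision, score, contributions = explain(
    {"income_k": 40, "years_employed": 5, "missed_payments": 2}
)
print(decision, round(score, 2))
# List the features from most to least influential for this applicant.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"  {name}: {value:+.2f}")
```

Publishing the list of inputs and their weights makes it possible to check that no protected attribute is used directly, although it does not by itself rule out discrimination through correlated features.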
Creating better social policy
Meetings like the ML & Law symposium at NIPS help bring together the legal and tech communities. It’s awesome to have lawyers come to an ML conference, to bridge the gap with their profession. Human nature is still such that lawyers all sat together on one side of the panel and computer scientists on the other side, but fortunately the moderator noticed it and “shuffled” the panelists :)
Another interesting series of events is the FAT ML workshops (Fairness, Accountability and Transparency in Machine Learning). The panelists at NIPS mentioned that it’s difficult to get a complete overview of what’s going on in this area of research, so there’s definitely room for these events to grow… Also, they mentioned that different communities use inconsistent terminology (which slows progress down). There’s a need for a common language, which may emerge from meetings like these.
Beyond such meetings, it may not be easy to have lawyers and technologists sit down and make progress together towards the objectives listed above. An incentive for that may be to fine organizations that fail to provide explanations of their AI-powered decisions.
In any case, it’s important for machine learners and social policy makers to work together. Tools like the Machine Learning Canvas can help them communicate about ML systems and their key aspects: data sources, data collection, making predictions, turning them into decisions, etc.
Follow me for more articles from my notes at NIPS (ML in the wild and AI for Data Science workshops)!