UK GDPR Watchdog: Explain Your AI
The UK’s Data Protection Authority just issued much-anticipated guidance that clarifies the complicated issue of the GDPR’s ‘right to explanation’. Here is some background on the issue and what the new information means.
Ever since the GDPR took effect, many in the AI industry have been quietly worrying about the enforcement of some of the regulation’s lesser-known provisions. EU regulators have been hard at work levying large fines on businesses for their privacy practices, collecting more than €335 million from companies within the last two years. The majority of fines have stemmed from the abuse and accidental exposure of personal information.
Many questions around the enforcement of the GDPR remain. One in particular, which could result in fines as high as 4% of a company’s annual revenue, stands as a proverbial boogeyman to AI practitioners everywhere:
What, exactly, does a “Right to Explanation” mean?
Simply put, this provision of the GDPR requires that companies “ensure fair and transparent processing” by giving users access to “meaningful information about the logic involved” in automated decisions that impact their lives. When algorithms were deterministic, broadly outlining the steps they took may have been an easy task. But most modern AI, and especially black-box techniques, cannot be explained so simply. Given the complexity of the math needed to arrive at a model’s prediction, no person can readily follow the rationale behind the machine’s decision, leaving developers and compliance officers alike vexed about how to protect themselves against prosecutors’ prying eyes.
Because this provision has been broadly considered quite vague, how it would be enforced was unclear and hotly debated in the academic community. Some scholars argued that the point was moot and the right to explanation would never materialize. Others claimed that broad, model-level explanations (e.g. model cards or an account of the methodology used to train and test the model) would suffice.
Given the complexity of ML decisioning, model-level explanations are absolutely necessary to describe the intended use, context, and the impact of these complex systems. But, given the reality of the behavior of these models and the presence of outliers for every observable rule, global explanations could in no way ever be enough to allow users to effectively challenge automated decisions made by “black box” algorithms.
Yesterday’s new guidance from the UK’s data watchdog (the ICO) dramatically changes this landscape, providing welcome clarity on how businesses need to think about transparency in their use of AI models.
The ICO’s new guidance formally requires companies to provide comprehensive model-level and inference-level explanations, even when using “black box” techniques.
In part 2 of the document, Explaining AI in practice, the ICO makes clear that even in cases where “black box” models are the only suitable models to use (as in image recognition), individual inference-level explanations must still be made available to consumers. The document outlines many of the techniques that can be used to approximate explanations, including proxy models and attribution methods such as LIME or SHAP that estimate feature importance for a particular machine-made decision, while also outlining the drawbacks of relying on this methodology alone.
This is great news for the public, because the spirit of the right to explanation is to allow consumers to challenge decisions that directly impact their lives. Ask anyone who’s trained a model using modern AI techniques and they’ll tell you that weird things happen around a model’s decision boundaries. Importantly, these oddities are not always confined to a small number of outlier cases, and may simply be the result of an oversight. They can have a dramatic impact on large swaths of the population, even when intentions are good.
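To make the idea of a per-decision explanation concrete, here is a minimal sketch of exact Shapley value attribution, the idea that SHAP approximates. It is not the SHAP library and not a method prescribed by the ICO. The scoring function, the feature order (income, debt, years of history), the applicant, and the baseline are all hypothetical, and the brute-force enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def black_box_score(features):
    """Hypothetical opaque model standing in for any scoring function."""
    income, debt, years = features
    score = 0.4 * income - 0.6 * debt + 0.2 * years
    # An interaction the model learned: a bonus near a decision boundary.
    if income > 50 and debt < 20:
        score += 10
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley attributions for one decision.

    Features outside a coalition are replaced by their baseline value
    (an assumption; other choices of reference point are possible).
    """
    n = len(instance)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [
                    instance[j] if j in coalition else baseline[j]
                    for j in range(n)
                ]
                with_i = list(without_i)
                with_i[i] = instance[i]
                values[i] += weight * (model(with_i) - model(without_i))
    return values


# Hypothetical applicant and reference point, for illustration only.
applicant = (60.0, 10.0, 5.0)
baseline = (40.0, 30.0, 2.0)
attributions = shapley_values(black_box_score, applicant, baseline)

for name, value in zip(("income", "debt", "years"), attributions):
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the applicant’s score and the baseline score, so each number can be read as that feature’s share of the decision. Note that the threshold bonus is split between income and debt, which is the kind of boundary behavior a global description of the model would not reveal.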
Importantly, the ICO’s guidelines lay out the need for deep and meaningful explanations, and in doing so create an incredible model for how to think about governing AI that lawmakers around the world should notice. They detail six major categories of explanations, all of which need to be taken into account at a global (model) and local (per-inference) level. In order to comply with the GDPR according to the ICO, businesses will need explanations of their model building process and explanations of the outcomes for the model broadly, along with an explanation for how the machine arrived at any particular consumer decision on the following dimensions:
- Rationale
- Responsibility
- Data
- Fairness
- Safety and performance
- Impact
Without taking the time to define the provisions of each explanation type, suffice it to say that when each of the above categories is satisfied, the business in question will have quite a lot of documentation for each model it builds. For any model in production, companies will need to know and explain (among other things) who trained it, which methodology was used (and why it was chosen over others), where the data came from, how it was balanced for demographic fairness, how the outcomes are tracked to guard against discriminatory practices, what the accuracy metrics were, how those are updated over time, how the testing and validation took place, and how the model’s security and robustness are addressed (including its resistance to adversarial attacks). They will also need to provide an impact explanation, which seems to require enterprises to study the broader societal impact of their model’s use in practical terms. Finally, the guidelines stipulate that companies must document their approach to ongoing, proactive monitoring for each of the explanation types.
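One way a team might keep track of this documentation is a simple per-model record that flags which sections are still empty. The sketch below is only an illustration: the class name, the field names, and the example values are hypothetical labels drawn from the topics listed above, not a schema defined by the ICO.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelExplanationRecord:
    """Hypothetical per-model documentation record (not an ICO schema)."""

    model_name: str
    trained_by: str = ""
    methodology: str = ""
    data_provenance: str = ""
    fairness_checks: str = ""
    performance_metrics: str = ""
    security_and_robustness: str = ""
    societal_impact: str = ""
    monitoring_plan: str = ""

    def missing_sections(self):
        """Return the names of sections that have not been filled in yet."""
        return [
            f.name for f in fields(self)
            if not getattr(self, f.name).strip()
        ]


# Example values are invented for illustration.
record = ModelExplanationRecord(
    model_name="credit-model-v1",
    trained_by="risk analytics team",
    methodology="gradient-boosted trees, chosen over logistic regression",
)
print(record.missing_sections())
```

A check like this could run before deployment so that a model with undocumented sections is caught early, though what counts as sufficient content in each section is a judgment the code cannot make.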
All in all, the ICO’s new guidelines set reasonable, achievable (and strict!) requirements for any company that uses AI and is subject to the GDPR’s reach.
The ICO’s approach is comprehensive, and seems well-informed by industry practitioners and academics studying AI. The field of explainability in AI is a developing one, and the policy guidance takes this into account, mapping out the best techniques available to the industry today without limiting itself to any one strategy that might become less relevant as new advances are made. It should act as a model for any company using AI, regardless of whether the GDPR affects its business from day to day. The guidance is open to comment through January 24, 2020, and will presumably become official policy shortly thereafter.