Challenges of Explainable AI
Artificial Intelligence (AI) is everywhere, with applications ranging from medical diagnosis to autonomous driving. As the use of AI and Machine Learning (ML) becomes increasingly common across industries and functions, interdisciplinary stakeholders are searching for ways to understand the systems they use so that they can trust the decisions those systems inform. This effort is sometimes referred to as “Explainable AI” or “XAI”.
Is there really a need for XAI?
The growth of the current XAI movement is, in part, the result of the proliferation of AI. State-of-the-art AI and ML systems, such as deep learning tools, are highly complex and inherently challenging to understand. This complexity is driving the investment in new research into XAI.(1)
At the same time, individuals and businesses who are not AI experts are able to deploy parts of the AI and ML toolkit without needing a substantive understanding of the mathematical processes behind the technology.(2) The rise of these “citizen data scientists” has democratized data science and machine learning, which heightens the need for human oversight. It also sharpens a core challenge of “explaining AI”: even if a system can be made to “explain” itself to an expert, that explanation may be largely incomprehensible to the citizen data scientist.(3,4)
The focus on trust and understanding that is driving the XAI movement relates to important questions of law and policy. An explanation for an AI or ML system can put the system’s reasoning into the open for debate about whether it is equitable or just, or may enable some sort of actionable understanding of why a decision was made.(5) Such understanding is already required by law in certain contexts. For example, in the United States, the Equal Credit Opportunity Act requires creditors to provide applicants with a “statement of specific reasons” for “adverse action.” The European Union’s General Data Protection Regulation (GDPR) discusses a “[data subject’s] right to…obtain an explanation of the decision reached…and to challenge the decision” in the similar context of automated decision-making (Article 22, Recital 71 GDPR). The conversation around these mandates may also shape future legislation in the United States and beyond.
Is everyone on board with XAI? No.
Some researchers, like Facebook’s Chief AI Scientist Yann LeCun and Google Brain’s Geoff Hinton, have argued that asking systems to “explain” themselves is a complex, infeasible task that may not lead to actionable insight.(6,7) Others disagree, arguing that explainability is necessary, as technologists need to consider the social implications of all parts of their AI systems.(8,9,10) Moreover, they argue, evolving research may make the task increasingly feasible.(11)
One of the key challenges is that the field lacks a common language — a reflection of important debates about the meaning of terms like “explainability” and “interpretability.”(12,13,14) Many researchers regard “interpretability” as the comprehensibility of a particular model or model output and view “explanations” as more encompassing.(13,14,15,16) In this view, interpretations are concerned with comprehending what the rules of a specific system are, while explanations may include both interpretations of the system as well as justifications of why the rules are what they are.(13,14,16,17) Others define the terms differently or use the two terms synonymously.(12)
The challenge of interpretation
To understand why interpretations for complex models are hard to generate, it’s useful to know what can make a model complex. Within the machine learning community, there are different ways of defining and quantifying a model’s complexity, most of which involve some consideration of the model’s behavior.(18) Here, we focus on complexity viewed through an analysis of the rules the model follows.(5,19)
One way to analyze the rules a model follows (and thus its complexity) is by considering linearity, monotonicity, continuity, and dimensionality.(5)
As Figure 1 demonstrates, a linear model has a constant slope, while a non-linear model does not. Linearity lends itself to interpretability because, for each unit change in the inputs, the output changes by a constant amount.(20) A monotonic function is one whose slope never changes sign: it is either always non-negative (the function never decreases) or always non-positive (the function never increases). Monotonicity guarantees that for a change in the input in a given direction, the output will always change in only one direction.(19) A continuous function is one without any “breaks” in the curve; discontinuous functions can be harder to interpret because the output may jump abruptly as the input changes.(5)
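The properties above can be probed numerically. The sketch below samples a function at many points and checks whether its slope is constant (linearity) and whether its slope ever changes sign (monotonicity). The functions are hypothetical stand-ins for model outputs, used only for illustration.

```python
# Illustrative sketch: numerically check linearity and monotonicity of
# a one-dimensional function by sampling slopes between nearby points.

def slopes(f, xs):
    """Slopes of f between consecutive sample points."""
    return [(f(b) - f(a)) / (b - a) for a, b in zip(xs, xs[1:])]

def is_linear(f, xs, tol=1e-9):
    """Linear: every sampled slope equals the first (constant slope)."""
    s = slopes(f, xs)
    return all(abs(si - s[0]) < tol for si in s)

def is_monotonic(f, xs):
    """Monotonic: sampled slopes never change sign."""
    s = slopes(f, xs)
    return all(si >= 0 for si in s) or all(si <= 0 for si in s)

xs = [i / 10 for i in range(-30, 31)]

def f_linear(x): return 2 * x + 1    # constant slope: linear, monotonic
def f_cubic(x): return x ** 3        # monotonic but not linear
def f_parabola(x): return x ** 2     # neither: slope changes sign

print(is_linear(f_linear, xs), is_monotonic(f_linear, xs))      # True True
print(is_linear(f_cubic, xs), is_monotonic(f_cubic, xs))        # False True
print(is_linear(f_parabola, xs), is_monotonic(f_parabola, xs))  # False False
```

Note that this sampling approach can only falsify linearity or monotonicity between sampled points; it mirrors the intuition of the definitions rather than proving them.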
A given dataset may contain some number n of observations and some number p of features (or variables) available for each observation.(20) Dimensionality can refer to the number of variables represented in a model.(5,21) Models or datasets with low dimensionality can be represented visually to human viewers in two- or three-dimensional space, as the models in the figure are. Models in high dimensions are generally more complex than those in low dimensions and are not easily visualized for human interpretation.(18) While there are many techniques for presenting high-dimensional datasets or models to humans by compressing them into low-dimensional representations, these representations may require significant background knowledge to comprehend.(27)
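One such compression technique is principal components analysis (PCA), which projects high-dimensional data onto the directions of greatest variance. The toy sketch below finds the single leading direction via power iteration and projects 3-D points onto it; the data and all names here are illustrative, and a real analysis would use a library implementation.

```python
# Toy PCA sketch: reduce p-dimensional points to one dimension by
# projecting onto the direction of greatest variance.

def pca_1d(data, iters=200):
    n, p = len(data), len(data[0])
    # Center each feature at zero.
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Sample covariance matrix (p x p).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1)
            for j in range(p)] for i in range(p)]
    # Power iteration: repeated multiplication converges to the leading
    # eigenvector, i.e. the direction of greatest variance.
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Each point's coordinate along the leading direction.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return scores, v

# Hypothetical 3-D data that actually varies along one direction.
data = [[t, 2 * t + 0.1, -t + 0.05] for t in range(10)]
scores, direction = pca_1d(data)
print(direction)  # roughly proportional to (1, 2, -1)
```

Because the toy data lies exactly on a line in 3-D space, the 1-D scores preserve all of its variation; real datasets lose information in the compression, which is part of why such representations need background knowledge to read correctly.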
Trying to quantify model complexity is just that — complex.
However, taken together, linearity, monotonicity, continuity, and dimensionality express some of the important characteristics of the issue.(5) While linear, monotonic, continuous functions in low-dimensional space are generally highly interpretable, combinations of non-linearity, non-monotonicity, discontinuity, and high dimensionality generally introduce increasing complexity and pose challenges to interpretability. For example, neural networks, which can be quite complex, generally model outputs as nonlinear functions of the inputs, often in high dimensionality.(18) The figure below offers a notional, non-exhaustive characterization of the complexity of various popular modeling and machine learning techniques.
This figure is by no means definitive. For example, while linear regression is characterized as “less complex,” a linear model with hundreds or thousands of included variables would be quite complex.
One helpful way to characterize efforts in XAI is by applicability — for example, whether a technique can be used to interpret or justify a single model or many, or whether it can be used to interpret or justify a single decision or larger trends.
- Model-agnostic approaches, which treat the internal workings of the model as an unknown black box, can generally be applied to entire classes of algorithms or learning techniques.(19,22)
- Model-specific approaches can only be used for specific techniques or narrow classes of techniques.(19) Model-specific approaches that dive into the internal workings of the model are also referred to as white box techniques.
When looking at a model, local interpretations focus on specific datapoints, while global interpretations focus on more general patterns across all datapoints.(12,13) Local interpretations are often more relevant for analyzing specific outcomes or decisions, while global interpretations are often more useful for overall monitoring and pattern detection. The table below identifies example explainability techniques at the intersections of these categories.
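These distinctions can be made concrete with a model-agnostic, local technique: perturb each input feature around a single datapoint and record how much the prediction moves. This is a simplified sensitivity sketch in the spirit of (but much cruder than) methods like LIME or SHAP; the black-box model and datapoint below are hypothetical.

```python
# Model-agnostic, local sketch: estimate each feature's influence on
# one prediction by nudging that feature and observing the output.
# The model is treated as a black box; only predict(x) is called.

def local_sensitivity(predict, x, eps=1e-4):
    """Approximate per-feature local influence at point x."""
    base = predict(x)
    influences = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] += eps
        influences.append((predict(perturbed) - base) / eps)
    return influences

# Hypothetical black-box scorer: feature 2 is irrelevant, and
# feature 1 matters three times as much as feature 0.
def black_box(x):
    return x[0] + 3 * x[1] + 0 * x[2]

point = [1.0, 2.0, 5.0]
influences = local_sensitivity(black_box, point)
print(influences)  # approximately [1.0, 3.0, 0.0]
```

Because only the point's neighborhood is examined, the result is a local interpretation: for a non-linear black box, the influences could look entirely different at another datapoint, which is exactly why global interpretations require a different approach.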
Key efforts generating interpretations and explanations
Key efforts in interpretability and explainability research include designing simpler models, building approximate models, and analyzing model inputs.
At a high level, designing for simplicity is focused on finding ways to build and use relatively simple, accurate models that are naturally interpretable.(5,26) Post-hoc methods generate approximate models to help explain the complex model used for the actual task.(13,25) Given a specific decision or type of decision, analyzing model inputs is focused on determining how different variables or concepts influenced that decision.(5,7) Interacting with and visualizing data and models is an important resource across each of these three efforts.
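The "approximate models" idea can be sketched minimally: fit a simple, interpretable surrogate (here, an ordinary least-squares line) to the predictions of a more complex model, then read the surrogate's coefficients as an approximate global explanation. The black box below is a hypothetical non-linear scorer used only for illustration.

```python
# Post-hoc global surrogate sketch: approximate a black-box model's
# predictions with a simple linear model and interpret its slope.

def fit_line(xs, ys):
    """Closed-form simple linear regression: y ~ a * x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    a = num / den
    return a, my - a * mx

def black_box(x):
    # Hypothetical complex model; the surrogate never sees its form.
    return x ** 2

xs = [i / 10 for i in range(21)]      # probe inputs on [0, 2]
ys = [black_box(x) for x in xs]       # black-box predictions
slope, intercept = fit_line(xs, ys)
# Surrogate reading: over [0, 2], the score rises about 2 units per
# unit of input (the average rate of change of x^2 on that interval).
print(round(slope, 2))  # 2.0
```

The surrogate is faithful only on the region it was fit to, and a straight line can badly misrepresent a curved function pointwise; this fidelity gap is a standard caveat for post-hoc approximations.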
While these approaches are not an exhaustive characterization of XAI, they provide a view into key lines of thinking in the area. Each approach has strengths and limitations, and there are a variety of emerging and established techniques in each area.
What does this all mean?
While most new work and research on these techniques is coming from the academic sector, XAI tools are beginning to materialize in the market. Whether XAI companies will be able to stand on their own, or if these tools will primarily be absorbed as a feature by established AI/ML players, remains to be seen.
Resources for Explainable AI
Open Source XAI Platforms:
- IBM’s Fairness 360
- Microsoft’s Interpretability packages
- Google’s “What If Tool”
- H2O.ai’s H2O Platform
- Oracle’s Skater
Key Conferences:
- ACM FAT*: ACM Conference on Fairness, Accountability and Transparency (fatconference.org)
- NeurIPS: Neural Information Processing Systems (nips.cc)
- CVPR: IEEE Conference on Computer Vision and Pattern Recognition (cvpr2019.thecvf.com)
- ICML: International Conference on Machine Learning (icml.cc)
- SIGKDD: ACM SIGKDD International Conference on Knowledge discovery and data mining (kdd.org)
- AAAI: AAAI Conference on Artificial Intelligence (aaai.org/Conferences/AAAI)
1. Gunning, D. (2016). Explainable Artificial Intelligence (XAI). Defense Advanced Research Projects Agency. Retrieved from https://www.darpa.mil/program/explainable-artificial-intelligence
2. Idoine, C. (2018). Citizen Data Scientists and Why They Matter. Gartner Blog Network. Retrieved from https://blogs.gartner.com/carlie-idoine/2018/05/13/citizen-data-scientists-and-why-they-matter/
3. Burrell, J. (2016). How the machine ‘thinks’: Understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 2053951715622512.
4. Miller, T., Howe, P., & Sonenberg, L. (2017). Explainable AI: Beware of inmates running the asylum or: How I learnt to stop worrying and love the social and behavioural sciences. arXiv preprint arXiv:1712.00547.
5. Selbst, A. D., & Barocas, S. (2018). The intuitive appeal of explainable machines. Fordham L. Rev., 87, 1085.
6. Simonite, T. (2018). Google’s AI Guru Wants Computers to Think More Like Brains. WIRED. Retrieved from https://www.wired.com/story/googles-ai-guru-computers-think-more-like-brains/
7. Lecun, Y. (2017). Panel Debate at the Interpretable ML Symposium. 2017 Conference on Neural Information Processing Systems.
8. Caruana, R. (2017). Panel Debate at the Interpretable ML Symposium. 2017 Conference on Neural Information Processing Systems.
9. Caruana, R., Lou, Y., Gehrke, J., Koch, P., Sturm, M., & Elhadad, N. (2015, August). Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1721–1730). ACM.
10. Jones, H. (2018). Geoff Hinton Dismissed the Need for Explainable AI: 8 Experts Explain Why He’s Wrong. Forbes. Retrieved from https://www.forbes.com/sites/cognitiveworld/2018/12/20/geoff-hinton-dismissed-the-need-for-explainable-ai-8-experts-explain-why-hes-wrong/#246ccdb1756d
11. Gunning, D. (2017). Explainable Artificial Intelligence (XAI). Program update. Defense Advanced Research Projects Agency. Retrieved from https://www.darpa.mil/attachments/XAIProgramUpdate.pdf
12. Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.
13. Mittelstadt, B., Russell, C., & Wachter, S. (2018). Explaining explanations in AI. arXiv preprint arXiv:1811.01439.
14. Miller, T. (2018). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence.
15. Lipton, Z. C. (2016). The mythos of model interpretability. arXiv preprint arXiv:1606.03490.
16. Lisboa, P. J. (2013, November). Interpretability in machine learning–principles and practice. In International Workshop on Fuzzy Logic and Applications (pp. 15–21). Springer, Cham.
17. Doshi-Velez, F., Kortz, M., Budish, R., Bavitz, C., Gershman, S., O’Brien, D., Schieber, S., Waldo, J., Weinberger, D. and Wood, A. (2017). Accountability of AI under the law: The role of explanation. arXiv preprint arXiv:1711.01134.
18. Friedman, J., Hastie, T., & Tibshirani, R. (2001). The elements of statistical learning (Vol. 1, No. 10). New York: Springer series in statistics.
19. Hall, P., & Gill, N. (2018). Introduction to Machine Learning Interpretability. O’Reilly Media, Incorporated.
20. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112, p. 18). New York: Springer.
21. Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical science, 16(3), 199–231.
22. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386.
23. Kim, B., Wattenberg, M., Gilmer, J., Cai, C., Wexler, J., Viegas, F., & Sayres, R. (2017). Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). arXiv preprint arXiv:1711.11279.
24. Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems (pp. 4765–4774).
25. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016, August). Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135–1144). ACM.
26. Rudin, C. (2018). Please stop explaining black box models for high stakes decisions. arXiv preprint arXiv:1811.10154.
27. Smith, L. I. (2002). A tutorial on principal components analysis.