Explainable AI, Explained
The value of AI lies in its ability to assist humans in making decisions by speeding information gathering and pattern recognition. AI outputs need to be explainable in order to be trusted and carefully considered. This requirement calls for a decision-oriented definition of explainable AI, comprising three key elements: the decision task; the AI-generated explanation; and an often-overlooked aspect, the “hidden” human knowledge unavailable to the AI system.
The decision task is any task in which a human must choose an action (or no action).
The explanation could be any AI output, such as the model’s prediction, a visualization of key factors affecting its prediction, or a natural language justification. Importantly, the explanation must be made available to a human in some form, usually visualized on a computer screen.
The last element, human knowledge unavailable to the AI system, is a critical component of explainable AI. It means there must be something that a human knows or understands about the world or the current situation that enables the person to make a better decision than a completely automated system could make. If there is no hidden human knowledge, the decision could be completely automated, rendering AI explanations unnecessary.
A good example is medical diagnosis. The decision task is to diagnose a patient’s disease correctly. The explanation could be a list of the most likely diagnoses, along with the symptoms that point to each one. The human knowledge is the doctor’s understanding of medical conditions and the ability to recognize when a symptom does not fit a proposed diagnosis. Drawing on this medical understanding, the doctor either accepts the AI’s predictions or recommendations or rejects them when they don’t align with the practitioner’s medical knowledge.
When AI goes wrong
Most people intuitively recognize that entirely automated decisions can go wrong, which is particularly dangerous in situations with significant real-world consequences. Ultimately, then, an explainable AI system must somehow integrate a human into the process to enable better decisions.
The main problems that explainable AI can help mitigate in automated decision-making fall into two categories: robustness concerns and social concerns.
As for robustness, one major issue with current AI systems is that they can fail in unexpected and catastrophic ways when the environment or context changes even slightly. Thus, explainable AI helps a human recognize and correct these failures to enable better decisions than those made using AI alone.
The other major issues are social matters, such as discrimination, lack of fairness, or lack of accountability. In these cases, either social norms (such as fairness) or legal factors (such as anti-discrimination laws) imply constraints on the system that can be difficult to formalize or enforce, because they are inherently social constructs.
In these cases, a human decision-maker or judge can carefully weigh the AI-generated explanation together with this social understanding, which is difficult to formalize mathematically, and make decisions that align with societal or legal values.
To trust or not to trust
Most AI methods are black boxes that are too complicated even for experts to fully understand — perhaps similar to the challenge of truly comprehending the human mind.
Without explainability, people may fall into one of two extremes when using AI systems. On one hand, without explanations, humans may never trust AI systems and so decide to ignore them completely. On the other hand, people may begin to implicitly trust AI systems more than they should, expecting them to magically understand everything about the world and speak the truth.
For low-risk situations, such as writing help or creativity assistance, the first problem can be mitigated with simple explanations. For high-risk situations, such as medical diagnoses or bail determinations, explainability can be critical for making correct decisions, and failing to scrutinize the AI’s reasoning in such instances could result in severe consequences, such as medical fatalities or unfair judicial verdicts.
Ultimately, explanations allow people to combine their own knowledge and reasoning with the reasoning of the AI system for a better overall outcome. Relying exclusively on either the human or the AI is suboptimal in many circumstances.
It’s thus vital that we do better at systematically defining what is meant by explainability, and how to measure progress toward this goal. The key lies in answering two questions: 1. What does the human have that the AI system lacks? 2. How can we effectively combine human and AI information sources for better decisions?
I do not think we can ever fully explain AI technology, nor do I think that should be the aim — just as I expect we will never be able to completely explain the inner workings of the human mind. Even so, by explaining some aspects of how humans process information, we can better understand people’s decisions and behaviors — knowing when to trust or not trust them.
This is analogous to explainable AI. Most AI systems will always be too complex to entirely explain their behavior. However, by understanding them better, we can know their limitations and abilities.
Data shifts, uncertainty looms
Our current research focuses on explaining distribution shifts and quantifying uncertainty, two complementary ways to better explain AI.
In the first case, we are seeking to provide machine learning (ML) operators with an explanation of how two datasets differ from each other (i.e., the distribution shift), to aid in identifying potential problems in the ML pipeline.
Distribution shift occurs in ML when the data used to train a model differs significantly from the data the model encounters in real-world applications. This mismatch can lead to poor performance, as the model’s assumptions about the data no longer hold true. For example, a self-driving car trained on sunny California roads may struggle in snowy Michigan due to a change in the distribution of weather conditions.
Addressing distribution shift is crucial to building robust and reliable ML models that perform well across diverse environments. Our explanations can help an ML practitioner better understand these distribution shifts and select the appropriate course of action.
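To make this idea concrete, here is a rough sketch — not our actual method — of one simple way to surface a distribution shift: compare each feature of the training data against the deployment data with a two-sample Kolmogorov-Smirnov test and flag the features that changed the most. The feature names and data below are hypothetical.

```python
# Illustrative sketch only: flag which (hypothetical) features shifted between
# training data and deployment data using a per-feature two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical driving data: training gathered in sunny conditions,
# deployment data gathered in snowy conditions.
feature_names = ["temperature", "visibility", "road_friction"]
train = rng.normal(loc=[25.0, 0.9, 0.8], scale=[5.0, 0.05, 0.05], size=(1000, 3))
deploy = rng.normal(loc=[-5.0, 0.4, 0.3], scale=[5.0, 0.10, 0.10], size=(1000, 3))

# A large KS statistic (and small p-value) for a feature suggests its
# distribution has shifted between the two datasets.
for j, name in enumerate(feature_names):
    stat, p = ks_2samp(train[:, j], deploy[:, j])
    print(f"{name:14s} KS statistic = {stat:.2f}, p-value = {p:.1e}")
```

An ML practitioner could then inspect the most-shifted features to decide whether to retrain the model, collect new data, or restrict where the model is used.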
We also are investigating ways to explain an AI system’s prediction to an operator by quantifying the uncertainty due to missing values. In many applications, values may be absent because of privacy concerns, data errors, or measurement mistakes.
For example, in a sensor network that is monitoring a remote location, sensors may fail due to power limits or harsh weather, which would result in missing sensor values. In a security application whose goal is to protect a certain region from intrusion, a security officer would need to decide if a detected movement is a real threat (e.g., a thief or person intent on vandalism) or if the movement merely involves a wildcat moving through the area.
While an AI system could still offer its best guess, missing values may significantly impair its ability to predict accurately. Therefore, we aim to also give the security officer an estimate of the uncertainty due to missing values.
If there is no uncertainty, the security officer can trust the prediction and act accordingly (e.g., send someone to investigate). If there is high uncertainty, the officer may choose to deploy a new sensor (e.g., a drone) to check further and reduce that uncertainty before deciding what to do. Thus, the explanation can help the officer make better decisions.
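As a rough illustration of this idea, rather than a description of our actual system, the sketch below fills in a missing sensor reading with many plausible values drawn from an assumed prior, runs a simple, hypothetical intruder-detection classifier on each, and reports the spread of the predictions as a measure of the uncertainty caused by the missing value.

```python
# Illustrative sketch only: estimate how much a missing sensor value could
# sway a hypothetical intruder-detection model's prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical training data: [motion_size, motion_speed] -> 1 = intruder, 0 = wildlife.
X_train = rng.normal(size=(500, 2))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X_train, y_train)

# At decision time the motion_speed sensor failed, so its value is missing.
observed_motion_size = 0.3
plausible_speeds = rng.normal(size=200)  # draws from an assumed prior over the missing value

# Predict with each plausible fill-in and look at the spread of the outputs.
X_query = np.column_stack([np.full(200, observed_motion_size), plausible_speeds])
probs = model.predict_proba(X_query)[:, 1]

print(f"P(intruder): mean = {probs.mean():.2f}, "
      f"range = [{probs.min():.2f}, {probs.max():.2f}]")
```

A narrow range suggests the missing sensor barely matters, so the officer can act on the prediction; a wide range signals high uncertainty, so gathering more data (e.g., sending a drone) may be worthwhile before acting.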
Looking ahead
In the future, I envision we will gain a much better handle on what AI can do and what humans can do as the two enhance their interactions. Explainability then will be the main interface between AI and people to enable seamless collaboration.
The dichotomy of what AI and humans provide to a task is not primarily a question of feasibility but rather of cost. While it is far easier for a computer to analyze thousands of data sources in seconds, it is much simpler for humans to quickly validate the results using their understanding of the context and the world. We already are seeing this with the explosion of large language models like ChatGPT, which produce immediate results that are not necessarily truthful, while a person can easily validate such results because the output is in natural language.
My contention is that only humans can define a task or problem based on human values. In the past, people defined more narrow tasks for computers to complete. Humans are now defining increasingly broader and more complex tasks for large language models to handle.
Ultimately, the task or goal definition will be provided by the person — potentially at higher and higher levels of abstraction as AI improves its task-solving abilities.
David I. Inouye, PhD
Assistant Professor, Elmore Family School of Electrical and Computer Engineering
Probabilistic and Understandable Machine Learning Lab
Faculty Member, Institute for Control, Optimization and Networks (ICON)
College of Engineering
Faculty Member, Center for Resilient Infrastructures, Systems, and Processes (CRISP)
Faculty Member, Center for Education and Research in Information Assurance and Security (CERIAS)
Purdue University