Explainable Artificial Intelligence: Technical Perspective — Part 1

Sparsha Devapalli
4 min read · Aug 14, 2020

Explainable AI is one of the hottest topics in the field of Machine Learning. Machine learning models are often thought of as uninterpretable black boxes. Ultimately, these models are used by humans who need to trust them, comprehend the errors they make, and understand the rationale behind their predictions. In this article, we will discuss the challenges that explainable AI addresses, along with current and emerging methods for understanding black-box models.

Before I proceed further into the technical aspects of Explainable AI, I suggest you explore my blog on the business perspective of Explainable AI.

Challenges:

One of the most prominent themes of Neural Information Processing Systems 2017, against a backdrop of remarkable progress in AI across many disciplines, was the perception of machine learning systems as black boxes: closed systems that receive inputs and generate outputs while giving no clue as to why.

We want AI systems to produce accurate outcomes along with transparent explanations for the decisions they make. For example, models like decision trees, linear models, and Bayesian models offer a certain amount of traceability and transparency in their decision-making process without sacrificing too much performance or accuracy. Conversely, powerful models like neural networks and ensemble models sacrifice transparency and explainability for power, performance, and accuracy.

The figure below illustrates some state-of-the-art techniques on the accuracy/interpretability trade-off map, where tools such as decision trees have more explanatory power than neural networks.

Source: DARPA

XAI methods attempt to provide an efficient trade-off between accuracy and interpretability, along with an effective human-computer interface that helps translate the model into a representation the end users can understand.

Explainable Machine Learning Approaches:

Researchers are looking for strategies to retroactively “bolt on” explanations to the predictions of machine learning models, which has proven to be very challenging. XAI techniques can be divided into ante-hoc techniques and post-hoc techniques. The best approach is to use a combination of both to enhance the explainability of current AI systems.

Explainable Machine Learning Approaches

Ante-hoc (Intrinsic) techniques:

Ante-hoc techniques ensure the model is interpretable from the start: they entail baking explainability into the model from the beginning. A clear example of this is Bayesian Deep Learning (BDL), which lets one gauge how uncertain a neural network is about its predictions.

Example: When a model is just 5% sure that an applicant will be turned down for a loan, that is a clear indication that additional resources (like gathering more data) should be considered before acting on the prediction.

Bayesian deep learning (BDL):

BDL enables one to gauge how uncertain a neural network is about its predictions. These deep architectures can model complex tasks by leveraging the hierarchical representation power of deep learning, while also being able to infer complex multi-modal posterior distributions.

Bayesian Approach

Bayesian deep learning models typically form uncertainty estimates either by placing distributions over model weights or by learning a direct mapping to probabilistic outputs. By examining these weight distributions across predictions and classes, we can tell a lot about which features led to which decisions and how important each one was.
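One lightweight way to obtain such uncertainty estimates in practice is Monte Carlo dropout, an approximation to Bayesian inference over the weights rather than the full BDL treatment described above. The sketch below is purely illustrative; the network, layer sizes, and the loan-risk framing are hypothetical, not taken from any particular system.

```python
# Minimal sketch of Monte Carlo dropout uncertainty (PyTorch, illustrative only):
# keep dropout active at prediction time, run several stochastic forward passes,
# and use the spread of the outputs as an uncertainty estimate.
import torch
import torch.nn as nn

class MCDropoutNet(nn.Module):
    def __init__(self, n_features: int, n_hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden),
            nn.ReLU(),
            nn.Dropout(p=0.5),      # stays active during the MC passes below
            nn.Linear(n_hidden, 1),
            nn.Sigmoid(),           # e.g. probability a loan applicant is turned down
        )

    def forward(self, x):
        return self.net(x)

def predict_with_uncertainty(model, x, n_samples: int = 100):
    model.train()                   # keep dropout switched on at inference time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)   # mean prediction, uncertainty

model = MCDropoutNet(n_features=10)
x = torch.randn(4, 10)              # four hypothetical applicants
mean, std = predict_with_uncertainty(model, x)
# A large std for an applicant signals the model is unsure, so gathering
# more data (or escalating to a human reviewer) may be warranted.
```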

Reversed Time Attention Model (RETAIN):

Researchers at Georgia Tech developed the RETAIN model to help doctors understand why an AI system was predicting patients to be at risk of heart failure. The RETAIN Recurrent Neural Network model makes use of attention mechanisms to improve interpretability: the attention weights reveal which parts of the input the network was focusing on and which features influenced its choice.

Common attention models vs. RETAIN, using folded diagrams of RNNs.

With a standard attention mechanism, the recurrence runs over the hidden state vector v_i (which embeds the input vector x_i), and this hinders interpretation of the model. In RETAIN, by contrast, the recurrence runs over the attention-generation components (h_i and g_i), while the hidden state v_i is produced by a simpler, more interpretable embedding of the input.
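Written out (using the notation of the RETAIN paper listed in the references), the context vector behind the prediction at visit i is just an attention-weighted sum of simple linear embeddings, which is what keeps each term traceable:

```latex
v_j = W_{\mathrm{emb}}\, x_j, \qquad
c_i = \sum_{j=1}^{i} \alpha_j \,\bigl(\beta_j \odot v_j\bigr), \qquad
\hat{y}_i = \operatorname{Softmax}\bigl(W c_i + b\bigr)
```

Here the scalar α_j weighs visit j as a whole and the vector β_j weighs the individual embedding dimensions, so each summand can be traced back to a specific visit and a specific feature.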

RETAIN

The patient’s hospital-visit data are fed to two RNNs, both of which have attention mechanisms. RETAIN mimics physician practice by attending to the EHR data in reverse time order, so that recent clinical visits are likely to receive higher attention.

Once trained, the model could predict a patient’s risk. But it could also use the alpha and beta attention weights to report which hospital visits (and which events within a visit) influenced its prediction.
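To make that concrete, here is a minimal NumPy sketch (not the original RETAIN code) of how the visit-level weights alpha and the variable-level weights beta combine with the embedding matrix to score each recorded event, following the contribution analysis in the RETAIN paper; all shapes, names, and the random placeholder weights are purely illustrative.

```python
# Illustrative sketch of RETAIN-style contribution analysis (NumPy only).
# alpha weighs whole visits, beta weighs embedding dimensions; together
# they attribute the risk score to individual codes in individual visits.
import numpy as np

rng = np.random.default_rng(0)
n_visits, n_codes, emb_dim = 5, 20, 8          # hypothetical sizes

x = rng.integers(0, 2, size=(n_visits, n_codes)).astype(float)  # multi-hot visit records
W_emb = rng.normal(size=(emb_dim, n_codes))    # linear code-embedding matrix
w_out = rng.normal(size=emb_dim)               # output-layer weights (binary risk score)

# In the real model, alpha and beta come from the two attention RNNs;
# here they are random placeholders with the right shapes.
alpha = rng.dirichlet(np.ones(n_visits))               # visit-level attention (sums to 1)
beta = np.tanh(rng.normal(size=(n_visits, emb_dim)))   # variable-level attention

v = x @ W_emb.T                                # visit embeddings: v_j = W_emb x_j
c = (alpha[:, None] * beta * v).sum(axis=0)    # context vector: sum_j alpha_j * (beta_j ⊙ v_j)
logit = w_out @ c                              # risk score before the final sigmoid

# Contribution of code k at visit j: alpha_j * w^T (beta_j ⊙ W_emb[:, k]) * x_{j,k}
contrib = alpha[:, None] * (beta @ (w_out[:, None] * W_emb)) * x
print(contrib.shape)                           # (n_visits, n_codes): one score per event
print(np.allclose(contrib.sum(), logit))       # contributions add up to the risk score
```

Because the per-event contributions sum exactly to the pre-activation risk score, a clinician can rank visits and clinical codes by how much each pushed the prediction up or down.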

To know more, kindly continue on to Part 2.

References:

  1. Edward Choi, Mohammad Taha Bahadori, Joshua A. Kulas, Andy Schuetz, Walter F. Stewart, Jimeng Sun — RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism
  2. An introduction to explainable AI, and why we need it — FreeCodeCamp
  3. Explainable AI — KDnuggets
