The great AI debate: Interpretability

Kirthi Shankar Sivamani · Published in The Startup · 7 min read · Jul 19, 2019

Deep learning (DL) has crept into almost every part of artificial intelligence. DL methods constitute the state of the art for most tasks in image processing, natural language processing, recommendation systems, and more. Consequently, as more of these models are deployed, their interpretability is coming into question. Deep learning is a form of machine learning based on learning hierarchical, distributed models to solve difficult AI problems. Broadly speaking, these models work extremely well for most tasks, but because they have so many learnable parameters (on the order of millions), it is hard to predict what effect a single parameter, or a handful of them, has on the overall network.

A great deal of research effort has gone into methods for visualizing network connections, feature maps from convolutional neural networks, and so on, in order to make neural networks more interpretable. ‘Feature visualization’, ‘Building blocks of interpretability’, ‘Activation atlases’, and ‘Visualizing memorization in RNNs’ are some excellent articles on Distill. They show that a lot of information about these black-box models can be uncovered, if it is truly required.

Whether it is required is one of the hottest topics in contemporary AI. To shed further light on it, NeurIPS 2017 hosted the conference’s first ever debate, on the proposition: interpretability is necessary in machine learning. But before I talk more about the debate, there is one fundamental question to answer: what does it mean to have an interpretable model? What makes a model understandable or explainable?

This question can be answered in several ways, but I think what people most often mean by an ‘interpretable ML model’ is that the model’s decisions can be well explained by its parameters, workings, and structure. Two models immediately come to mind: linear regression and decision trees.

Explaining linear regression predictions

Linear regression is often used to model the behavior of one target variable with respect to one or more inputs. Once the model is trained (the coefficients can even be computed in closed form), we get something like this:

Target = A*Features + B

Here, A is the vector of learnt coefficients and B is the bias term. Whenever this model makes a prediction, it is extremely easy to explain the decision: all one has to do is look at the coefficients. Features with larger coefficients (on comparably scaled inputs) matter more to the prediction. Simple.
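
To make this concrete, here is a minimal sketch (with synthetic data, not taken from the article) of fitting a linear model with scikit-learn and reading the learnt A and B straight off the fitted object:

```python
# A minimal sketch: fit a linear regression and inspect its coefficients.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                              # three synthetic features
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * X[:, 2] + 1.5    # known coefficients + bias

model = LinearRegression().fit(X, y)

# The learnt vector A and bias B from `Target = A*Features + B`:
print("coefficients (A):", model.coef_)       # roughly [ 4.0, -2.0, 0.1]
print("bias (B):        ", model.intercept_)  # roughly 1.5
# On comparably scaled features, a large |coefficient| marks a feature
# the model relies on heavily for its predictions.
```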

Explaining decision tree predictions

It is also fairly easy to see why this type of model is explainable. The tree can be visualized, and the decision taken at every node can be inspected. For example, imagine a tree whose target variable is the number of bedrooms and whose features are the number of family members, marital status, and salary. Any prediction the tree makes can be traced back through its splits and thus explained.
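
As a small illustration (the data and feature names below are hypothetical, loosely following the bedroom example above), scikit-learn can print a fitted tree’s splits so that every prediction can be traced by hand:

```python
# A minimal sketch: train a small decision tree and print its splits.
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

# columns: family_members, is_married (0/1), salary (in thousands)
X = np.array([[2, 1, 40], [4, 1, 90], [1, 0, 30], [5, 1, 120], [3, 0, 60]])
y = np.array([1, 3, 1, 4, 2])   # target: number of bedrooms

tree = DecisionTreeRegressor(max_depth=2).fit(X, y)

# Every prediction can be traced through these printed if/else splits.
print(export_text(tree, feature_names=["family_members", "is_married", "salary"]))
```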

However, ensembling many trees and using techniques such as gradient boosting does significantly reduce this interpretability.

These explainable models are decent for small-scale, low-cost machine learning on small datasets. However, on more complex problems that require more data and higher-dimensional inputs, they fall significantly short of their ‘not so interpretable’ counterparts, i.e. neural networks. Roughly speaking, there is a trade-off: the model families that tend to be more accurate on such problems also tend to be less interpretable.

This makes us wonder, do we actually need our results to be explainable if we’re able to achieve better accuracy? This brings us back to the original NeurIPS 2017 debate: Interpretability is necessary in machine learning.

Both sides of the debate (two versus two) were argued by renowned people in the field of AI, and both made excellent points.

FOR: Rich Caruana (Senior Researcher at Microsoft)
FOR: Patrice Simard (Deputy Managing Director at Microsoft Research)
AGAINST: Kilian Weinberger (Associate Professor at Cornell University)
AGAINST: Yann LeCun (VP and Chief AI Scientist at Facebook)

AGAINST: Me, but unfortunately I wasn’t invited to the debate 😐

FOR

The main point was put forth using a real-world example, the pneumonia risk prediction problem. It goes like this:

Many pneumonia patients need treatment, but not all of them can be treated at once because resources are limited. So a deep neural network is trained to distinguish low-risk from high-risk patients and determine whom to treat first. The model is extremely accurate on the training data and is used to make the predictions. After thorough inspection by experts, it is found that the network has learnt something unusual from the dataset: patients with a history of asthma are extremely low-risk and do not require immediate treatment. This sounds counter-intuitive, because pneumonia is a lung disease and asthmatic patients should be more susceptible to it, and therefore higher-risk. Further analysis revealed that asthmatic patients in the training data were low risk for pneumonia because they tend to receive professional treatment much earlier than patients without asthma: knowing their lungs are not in perfect condition, they seek medical advice at the first small sign that something is wrong, whereas people without asthma tend to delay treatment until the problem becomes grave.

So it was the shorter time to treatment that made asthmatic patients low risk in the training data, not the asthma itself. During training, the variable ‘has asthma’ acted as a proxy for ‘time to treatment’. Had experts not been involved, this mistake would likely have gone unnoticed, asthmatic patients would have been de-prioritized, and the consequences could have been fatal.
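
As a rough sketch of the kind of inspection that catches this, the toy simulation below (entirely synthetic, not the real pneumonia study, and using a gradient-boosted model rather than whatever the original system was) builds in an ‘earlier treatment for asthmatics’ effect and then checks the learnt effect of the asthma flag; it comes out as lower predicted risk, exactly the counter-intuitive rule the experts spotted:

```python
# A hypothetical sketch: a proxy variable surfacing in a fitted model.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(0)
n = 5000
asthma = rng.integers(0, 2, n)                              # 1 = history of asthma
severity = rng.normal(size=n)                               # true (unobserved) severity
symptom_score = severity + rng.normal(scale=0.3, size=n)    # what the model sees
# Asthmatic patients get treated sooner, which improves their recorded outcomes:
time_to_treatment = 1.0 - 1.5 * asthma + rng.normal(scale=0.3, size=n)
high_risk = (severity + 0.8 * time_to_treatment > 1.0).astype(int)

X = np.column_stack([asthma, symptom_score])                # time_to_treatment is missing
clf = GradientBoostingClassifier().fit(X, high_risk)

# Average effect of the asthma flag on predicted risk: lower for asthma == 1,
# because the flag is standing in for the missing time-to-treatment variable.
pd_asthma = partial_dependence(clf, X, features=[0])
print(pd_asthma["average"])
```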

Overall, this shows that machine learning models learn whatever characteristics the training data happens to contain, and those learned rules are not easy to interpret from the outside. High accuracy should not be mistaken for good performance. ML models’ decisions need to be explainable so that mistakes like this one can be caught.

AGAINST

Here, the debaters did acknowledge the importance of experts in sensitive fields such as medicine, law, and criminal justice. However, the fact is that even today, machine learning systems make millions of decisions per second without anyone wanting to recheck, inspect, or interpret them. No one has the time, and users are extremely happy. Examples include recommendation engines, machine translation, and, to a certain extent, autonomous driving. The whole point of automated systems is to free the human mind from trivial decisions and routine work (driving!); that point is defeated if we instead spend the freed-up time interpreting the decisions these models make. What matters is that these systems function accurately and safely at test time, and that their users are satisfied.

Yann LeCun pointed out that, in industry, clients would ask for models that are extremely explainable and interpretable and claim they don’t care about anything else. However, when these clients were presented with two models (one extremely explainable and 90% accurate, the other a complete black box with a superior accuracy of 99%), they would always choose the more accurate one. This suggests that people don’t really care about interpretability; they just want some form of reassurance that the model works.

Overall, interpretability is not important if you can show that the model works well in the conditions it is supposed to work in (i.e. at test time). Good testing procedures such as A/B testing and gradually widening deployment are useful for confirming a model’s ability to perform. Moreover, neural networks are not really black boxes: a thorough sensitivity analysis can be done to look at the effect of every variable. The analysis may not be as explicit as “this coefficient is extremely large, so this feature must be important”, but it can be done.
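
For instance, a minimal sketch of such a sensitivity analysis (using a toy PyTorch network standing in for a real model) is to look at the gradient of the output with respect to each input variable:

```python
# A minimal sketch: gradient-based sensitivity of a toy network's output
# to each of its input variables.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))

x = torch.randn(256, 4, requires_grad=True)   # a batch of inputs
net(x).sum().backward()                       # d(output)/d(input) for every sample

# Mean absolute gradient per feature: larger values mean the output is more
# sensitive to that variable around these inputs.
sensitivity = x.grad.abs().mean(dim=0)
print(sensitivity)
```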

As a final thought, I also feel that interpretability is overrated, and is largely a form of reassurance that people seek before using an ML model. However, I do concede that certain domains, such as law and medicine, require expert analysis of ML models before they are used at large scale. Even that may no longer be needed once AI reaches a stage where models are intelligent enough to interpret latent meanings in the training data themselves (such as variables acting as proxies for other, missing variables, as in the example in the FOR section).

Resources

  1. https://www.youtube.com/watch?v=93Xv8vJ2acI
