Why Can’t I Tell What My Neural Network is Doing?

Malay Haldar · Published in The Startup · May 9, 2020

As a machine learning engineer building neural networks for search ranking, I sometimes get the (very reasonable) question: “Why are things ranked this way in this particular search result?” The answer, in three honest words: I don’t know.

Then I immediately realize how bizarre that sounds. If you ask the engineer across my desk a question like “Why is this search result taking 10 seconds to load?”, you expect an answer. So why can’t you ask why the neural network ranked this thing at the #1 spot? To salvage the situation, I apologetically offer, “I can’t tell you anything about this particular example, but I have somewhat of an answer if you ask me about a million search results.”

And then I realize the conversation has lost any sense of sanity. The only remaining thought on the other side is perhaps, “Thank God this guy is not a doctor, otherwise he’d be telling me he has no clue what he’s doing to me, but that he’ll save millions of patients.”

To understand where this paradox stems from, we need to start with a little thought experiment.

Suppose I give you two coins. One is a fair coin, with equal probability of 1/2 for heads or tails. The other is slightly loaded towards heads: its probability of landing heads is 501/1000, whereas for tails it’s 499/1000.

How do you tell the difference between these two coins?

If you toss each of the coins once, and get heads on one and tails on the other, will you conclude the loaded coin is the one that landed heads? Of course not, because there’s a (1/2)*(499/1000) = 499/2000 chance of the fair coin landing heads and the loaded coin landing tails, which is very close to the chance of the loaded coin landing heads and the fair coin landing tails, (501/1000)*(1/2) = 501/2000. So one flip is clearly not enough to decide. How about 10, 100, …?

Intuitively, as you do more and more flips, the chance that the loaded coin will land more heads than the fair coin keeps increasing. After, say, a million flips, you can confidently say that the coin with the larger number of heads is the loaded one.
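If you’d rather not take that on intuition, a few lines of simulation make the point. This is my own sketch, not from the original post; the 0.5 and 0.501 probabilities are the ones from the thought experiment above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_flips = 1_000_000

# Flip each coin a million times; True means heads.
fair_heads = rng.random(n_flips) < 0.5
loaded_heads = rng.random(n_flips) < 0.501

print("fair coin heads:  ", fair_heads.sum())
print("loaded coin heads:", loaded_heads.sum())
# Expect roughly a thousand extra heads for the loaded coin,
# though the exact gap fluctuates from run to run.
```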

Formally, if p is the chance of a coin landing heads, then the chance that it will land more heads than tails in 2N flips is given by:

P(more heads than tails in 2N flips) = Σ_{k=N+1}^{2N} C(2N, k) · p^k · (1−p)^(2N−k)

If that looks complicated, feel free to ignore it. Instead, look at the graph below, which plots the function above for the loaded and the fair coins. As we keep increasing the number of flips along the x-axis, we see that the probability of getting more heads than tails for the fair coin converges towards 0.5 (it never quite reaches 0.5 because of the small chance of an exact tie). For a fair coin, the chance of getting more heads than tails is equal to the chance of getting more tails than heads.

For the loaded coin, the chance of getting more heads than tails approaches 1.0 instead.
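If you want to reproduce a graph like that, here is a small sketch using scipy.stats.binom, presumably the binom.pmf() linked at the end of the post, though the plotting code here is my own reconstruction.

```python
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt

def prob_more_heads_than_tails(p, n_pairs):
    """P(#heads > #tails) in 2*n_pairs flips of a coin with P(heads) = p."""
    n_flips = 2 * n_pairs
    ks = np.arange(n_pairs + 1, n_flips + 1)  # strictly more heads than tails
    return binom.pmf(ks, n_flips, p).sum()

# Sweep N from 1 to 1,000,000, i.e. up to 2 million flips.
n_pairs_grid = np.unique(np.logspace(0, 6, 40).astype(int))
fair = [prob_more_heads_than_tails(0.5, n) for n in n_pairs_grid]
loaded = [prob_more_heads_than_tails(0.501, n) for n in n_pairs_grid]

plt.semilogx(2 * n_pairs_grid, fair, label="fair coin (p = 0.5)")
plt.semilogx(2 * n_pairs_grid, loaded, label="loaded coin (p = 0.501)")
plt.xlabel("number of flips (2N)")
plt.ylabel("P(more heads than tails)")
plt.legend()
plt.show()
```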

Note the paradox here! From a single flip of each coin, we can tell virtually nothing about the two coins. After 2 million flips, we can say with very high confidence which is the loaded coin. We don’t know anything more about the inner workings of the two coins in either situation. The only new information we gained was how the two coins behave over 2 million flips in aggregate.

So what does all of this have to do with neural networks and search ranking? Building neural networks to optimize search ranking is close to this business of building a loaded coin.

To build a neural network, we look at millions of search results and the outcomes associated with them: for example, the search results users clicked on, the items from the search results they bought, and so on. Then we go and juggle the numbers inside the neural network so that the search results are “loaded” towards the desired outcome, like items with higher chances of getting bought being ranked higher.

Note that the very process of building the neural network is based on looking at millions of search results in aggregate and optimizing to make the search results “loaded” at that scale. The process does not work for one search result at a time, and trying to build a neural network by looking at a small number of examples leads to disastrous results, technically referred to as overfitting.
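To make this concrete, here is a heavily simplified sketch, not production ranking code: synthetic features and purchase labels, with scikit-learn’s MLPClassifier standing in for a real ranking network. The point is just that both training and evaluation operate on many results in aggregate.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in for search-result features (price, rating, relevance, ...)
# and an outcome label such as "did the user buy this result?".
n_results, n_features = 100_000, 8
X = rng.normal(size=(n_results, n_features))
true_weights = rng.normal(size=n_features)
buy_prob = 1 / (1 + np.exp(-(X @ true_weights)))
y = (rng.random(n_results) < buy_prob).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# "Juggling the numbers": fit a small network so that its scores are loaded
# towards purchases in aggregate, across tens of thousands of results at once.
model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=30, random_state=0)
model.fit(X_train, y_train)

# The only honest statement is an aggregate one: the scores separate bought
# from not-bought results better than chance across many examples, not a
# story about why any single result landed where it did.
scores = model.predict_proba(X_test)[:, 1]
print("held-out AUC:", roc_auc_score(y_test, scores))
```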

So asking what the neural network is doing for a particular search result is like asking how the loaded coin is working for a single flip. The answer in both cases: don’t have a clue. But that doesn’t mean we know nothing. Given millions of trials, we can reliably tell a loaded coin from a fair one, and better search results from inferior ones. Machine learning engineers work towards producing search results that are better than the previous ones by a tiny fraction, typically in the 0.5% to 1% range. While they can certainly improve the search results in this fashion, they cannot answer what is going on in a particular search result.

As always, there are caveats. For some particular classes of problems, like using neural networks to process images, one can look inside the neural network and sometimes locate the part of the image it was focusing on. Sometimes you can build an explainer on top of your neural network. But chances are, for your particular application, you’ll continue to struggle to explain what is going on in any particular instance of the problem.

Fun fact: a large part of the math for telling whether something is actually working or just working by chance was developed by studying the effects of medicines on patients. So the next time you take pills prescribed by your doctor, it may be worth wondering whether you are participating in some statistical paradox yourself.

Note For Advanced Readers

Some people may raise the question: what about models other than neural networks, for instance logistic regression? I can fully explain their behavior, so why can’t I do the same for neural networks?

To answer that, we can place models in three broad categories:

1) Fully explainable: These are linear models like logistic regression.

2) Partially explainable: Models like gradient boosted decision trees (GBDT).

3) Unexplainable: Models with sufficient non-linearity like neural networks.

To understand the differences, consider the model as a function of its inputs, written as f(x0, x1, x2, …). Now let’s assume we are interested in finding the influence of a particular input, say x2, on the output of the model.

This influence can be represented as the partial derivative of f(x0, x1, x2, …) with respect to x2, written as ∂f(x0, x1, x2, …)/∂x2.

In the case of linear models like logistic regression, this derivative (taken on the model’s linear score, i.e. the log-odds) reduces to a constant: the weight on x2. Ergo, full explainability.

For neural networks, this derivative is yet another non-linear function of the inputs, which we can call g(x0, x1, x2, …). There is no simple way to interpret this arbitrary non-linear function, and therefore no explainability.
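A small symbolic sketch makes the contrast concrete (my own illustration; the weight names w, v, u are made up): differentiating a linear score gives back a constant weight, while differentiating even a one-hidden-unit tanh network gives a function of all the inputs.

```python
import sympy as sp

x0, x1, x2 = sp.symbols("x0 x1 x2")
w0, w1, w2 = sp.symbols("w0 w1 w2")

# Linear score (the log-odds of a logistic regression model).
linear_score = w0 * x0 + w1 * x1 + w2 * x2
print(sp.diff(linear_score, x2))   # -> w2, a constant: fully explainable

# A tiny neural network: one tanh hidden unit feeding a linear output.
v0, v1, v2, u = sp.symbols("v0 v1 v2 u")
nn_output = u * sp.tanh(v0 * x0 + v1 * x1 + v2 * x2)
print(sp.diff(nn_output, x2))      # -> u*v2*(1 - tanh(...)**2): depends on every input
```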

But then what about GBDT? If you go back to the paper that introduced partial dependence plots for decision trees, you’ll notice that their construction comes with a caveat: the plots are valid assuming independence of the variables. In practice, for GBDTs the non-linear interactions between variables happen to be mild enough in many cases that people find value in making that assumption and constructing the partial dependence plots. But where strong interactions exist, the assumption is simply wrong, and the plots are no longer meaningful. For neural networks, complex non-linearity happens to be the rule rather than the exception, which makes the independence-of-variables assumption invalid in most cases.
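For the curious, here is a minimal sketch of my own (synthetic data, scikit-learn’s partial_dependence) showing what a partial dependence plot for a GBDT boils down to: sweep one feature over a grid, average the model’s prediction over the rest of the data, and it is exactly that averaging step where the independence assumption sneaks in.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(0)

# Synthetic data: the target depends on x0 plus an x0*x1 interaction.
X = rng.normal(size=(5000, 3))
y = 2 * X[:, 0] + X[:, 0] * X[:, 1] + rng.normal(scale=0.1, size=5000)

gbdt = GradientBoostingRegressor().fit(X, y)

# Partial dependence of the prediction on feature 0: the model is evaluated
# over the whole dataset with x0 forced to each grid value, then averaged.
# That averaging silently treats x0 as independent of the other features.
pd_result = partial_dependence(gbdt, X, features=[0], grid_resolution=20)
print(pd_result["average"][0])   # averaged predictions along the x0 grid
```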

See also:

How Do Neural Networks Work?

binom.pmf() used to generate the graph
