Bayesian reasoning is deductive, not inductive

“If anything is to be probable, then something must be certain.” – C.I. Lewis

Ferdinand Hodler, Le lac Léman et le Mont-Blanc

Can evidence support an hypothesis?

Empirical support lies at the core of our idea of rationality. We ask for “evidence-based policies”; we admonish each other to “back up claims with data”; we reject statements that are not “supported by the facts”. And yet, as a matter of logic, the idea of empirical support is surprisingly difficult to pin down.

This is the problem of induction, most famously stated by David Hume, who believed it to be insoluble. Our knowledge of the world consists of theories that explain what we see in terms of things that we don’t see. How can we infer general theories from limited observations? We can’t deduce them from the evidence, since their very nature is to go beyond the evidence. Can a theory be confirmed by the evidence compatible with it, or made more probable? Does evidence allow us to feel more confident in our beliefs? If so, by which kind of logic?

These questions are central to a profound debate in philosophy of science. Steven Pinker made a passing reference to this debate in Enlightenment Now:

The first answer is arguably the most famous. According to Popper, evidence can never, in any way, support or justify a theory, or make it more probable. He believed that David Hume’s statement of the problem of induction was “a gem of priceless value for the theory of objective knowledge: a simple, straightforward, logical refutation of any claim that induction could be a valid argument, or a justifiable way of reasoning”.

Popper’s solution comes from the realization that we do not need induction to create knowledge. The fact that a scientific theory cannot be supported by evidence does not amount to a demonstration that it is false: whether a theory is true is independent of whether we can prove it. Science, according to Popper, is based on the logical asymmetry between verification and refutation. No amount of evidence can ever prove that a theory is true; however, a single false statement deduced from a theory proves that the theory is false. We can create knowledge, therefore, by making unsupported and unjustified guesses, and seeing which ones withstand our attempts to refute them.

But Popper’s negative account of empiricism proved difficult to accept. The idea of supporting evidence is a resilient one. In Fashionable Nonsense, Sokal and Bricmont expressed a common criticism of Popper that resurfaced many times in the history of philosophy:

The second approach mentioned by Pinker, Bayesian reasoning, is seen as a possible remedy. According to Bayesianism, probabilities represent degrees of belief in statements, which can then be incremented or decremented according to the evidence. The idea is simple. We start with a set of possible hypotheses, each with a given probability of being true. The probability distribution is supposed to incorporate all the relevant information we already have: if we know nothing else, all possibilities will have equal probability. Then, we look at the evidence, and ask ourselves: how probable was it to observe that evidence, given each possible hypothesis? Using a famous mathematical rule called Bayes’ theorem, we can then update the probability of each possible hypothesis, given the probability of the evidence. Reasoning in this way is also known as “inverse probability”, because instead of computing the probability of observations according to causes, we assign probabilities to possible causes, according to our observations.
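The update described above can be sketched in a few lines of Python. The two-urn hypotheses and the specific numbers are my own illustration, not from the text; the point is only that the posterior follows mechanically from the prior and the likelihoods:

```python
# Sketch of a Bayesian update over a fixed set of hypotheses.
# Hypotheses and numbers are illustrative, not from the article.

# Two conjectures about an urn:
#   H1 = "the urn is 75% black marbles", H2 = "the urn is 25% black marbles".
priors = {"H1": 0.5, "H2": 0.5}            # equal probability: no other information
likelihood_black = {"H1": 0.75, "H2": 0.25}  # P(draw black | hypothesis)

# We draw one marble and it is black. Bayes' theorem:
#   P(H | black) = P(black | H) * P(H) / P(black)
evidence = sum(likelihood_black[h] * priors[h] for h in priors)  # P(black)
posteriors = {h: likelihood_black[h] * priors[h] / evidence for h in priors}

print(posteriors)  # {'H1': 0.75, 'H2': 0.25}
```

Note that nothing here goes beyond the premises: given the prior distribution and the likelihoods, the posterior is deductively entailed.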

This is often seen as a rigorous, mathematically impeccable formalization of empirical support and rationality itself. Bayesianism was adopted by several popular science authors, including Sean Carroll and Nate Silver, and enthusiastically promoted by the online group of thinkers known as the “Rationalist community”, organized around the writings of Eliezer Yudkowsky and Scott Alexander.

In what could arguably be considered the Bible of Bayesianism, Probability Theory: The Logic of Science, the late E.T. Jaynes had some scathing criticism for Popper and others who have denied the possibility of induction. He refers to them as the “irrationalists” and criticizes Popper in these terms:

Popper always rejected the idea of searching for probable theories. On the contrary, because we want theories with high informative content that make specific predictions, he argued that a better theory will always mean a less probable theory. In a paper titled “A proof of the impossibility of inductive probability”, Popper and his collaborator David Miller set out to demonstrate, in a technical fashion, that the part of an hypothesis that is not deductively entailed by the evidence is always strongly counter-supported by it. According to them, “this result is completely devastating for the inductive interpretation of the calculus of probability”.

According to Jaynes, “written for scientists, this is like trying to prove the impossibility of heavier-than-air flight to an assembly of professional airline pilots.”

As an adherent of the Bayesian approach to statistics and probability, and an admirer of Jaynes, my thesis here is that Popper was right. Rationality, including Bayesian reasoning, does indeed consist only of deductive logic. (As David Miller put it, “the use of Bayes theorem does not characterize Bayesianism any more than the use of Pythagoras’ theorem characterizes Pythagoreanism”).

I believe the debate between Bayesians and Popperians comes from a misunderstanding of the word “induction” as used by Bayesians. Bayesian inference is not a form of induction: it is entirely deductive. If we have a “definite set of specified alternatives” with a probability distribution, and if we can use this model to compute the probability of future observations under each of those alternatives, the subsequent modification of the probability distribution is logically entailed by our premises — which makes it a deductive inference. We are not learning anything beyond what we already put into our model and what we subsequently observe: we move smoothly from a prior set of assumptions to a posterior set of conclusions, according to clear mathematical rules.

It seems preposterous to suggest that such an important philosophical debate turns on a misuse of words, but I really believe that’s what’s happening here. We were misled into calling Bayesian inference “inductive probability” because it makes it look as if evidence can support an hypothesis without deductively entailing it. But in fact, the evidence only supports that hypothesis via a prior set of probabilistic assumptions that are not themselves supported by the evidence.

This is how David Miller expresses the problem:

More fundamentally, it is easy to see how Bayesianism fails as a philosophy of science. The logic of science does not consist of picking the most probable explanation from a set of preordained alternatives — it consists of creating new ones and putting them to the test. The set of all possible scientific explanations does not obey the probability calculus, simply because they cannot be known in advance. As David Deutsch observed, the negation of a scientific explanation does not constitute an alternative explanation.

Jaynes seems to think it’s ridiculous to talk about the set of all possible scientific explanations, because such a set is not well-defined in terms of probability theory. But this is precisely the point. Anyone concerned with the truth must admit that the answers we are looking for may not already be contained in our existing models. Given a set of alternative hypotheses, the probabilities we assign to them depend upon the validity of that model — which remains mysterious. This is what makes Bayesianism a static philosophy of science. It is not compatible with the growth of knowledge — the creation of new explanations and new models.

Furthermore, the probabilities computed by a model have nothing to do with the probability of the model itself being true. If evidence can deductively change a probability distribution, via a framework of assumptions, in no way can it “support” that framework as a whole. Even if a Bayesian model achieves extraordinary predictive accuracy, that accuracy does not logically imply that the model contains any truth about the world (although you might conjecture that it does to explain why it works so well). There could always be better explanations. In the Popperian view, it’s the model as a whole, with its assumptions about the set of possibilities, that should be seen as conjectural, with its better alternatives waiting to be conjectured into existence. No amount of predictive success can tell you that your model is probably true — except, maybe, in light of another, more general model, subject to the same objection.

The most elegant statement of that argument comes from Jacob Bronowski:

As a final note, I want to give an example of the misuse of probability theory to express epistemological truths. Sean Carroll and Nate Silver both remark that when a Bayesian thinker assigns a probability of 1 or 0 to a given statement, it means that no evidence will ever change their mind. Thus, to reflect the uncertain and revisable nature of scientific knowledge, they suggest that there is something irrational about assigning a probability of one or zero to anything. This idea is also known as Cromwell’s rule, after the famous quote from Oliver Cromwell: “I beseech you, in the bowels of Christ, think it possible that you may be mistaken.”

This, to me, is a misconception. If I fill an urn with black marbles, it is not irrational, based on my model, to say that there is a 100% chance that the next marble I draw will be black. It’s not an assertion of epistemic or metaphysical certainty, or a form of dogmatism. It’s a straightforward deduction from the information I have about the contents of the urn. The model itself is still conjectural: any result other than a black marble would flatly refute it. What’s irrational is not assigning probabilities of 1 or 0; it is holding on to models that don’t work, perhaps because they wrongly assigned probabilities of 1 and 0.
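The urn example can be made concrete in a short sketch. The model and function names here are my own illustration; the point is that probability 1 is a deduction within the model, while the model itself remains open to refutation:

```python
# Within the model, P(black) = 1 is a deduction, not dogmatism;
# the model as a whole remains a refutable conjecture.

model = {"black": 1.0, "white": 0.0}  # conjecture: I filled the urn with black marbles

def predict(color):
    """Probability of drawing `color`, deduced from the model."""
    return model.get(color, 0.0)

def refutes(observation):
    """An observation the model assigned probability 0 flatly refutes the model."""
    return predict(observation) == 0.0

assert predict("black") == 1.0  # a straightforward deduction from the premises
assert refutes("white")         # a single white marble and the conjecture is dead
```

Assigning probability 1 here commits me to nothing beyond the conjecture already stated in the model; what would be irrational is keeping the model after drawing a white marble.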

I beseech you, in the bowels of Christ, to see the difference.

So, can data support an hypothesis? My answer: yes, in a deductive manner, given a well-specified set of all possibilities known in advance, and prior conjectures about what the evidence would look like under each of those possibilities.

The resilience of the idea of empirical support may come from the fact that a rational thinker can only ever entertain a finite set of alternative explanations; within such a set, the psychology of belief and our subjective sense of plausibility could well mirror the mathematics of Bayesian probability, in the sense described by Sokal and Bricmont. For practical purposes, it’s possible that the idea of evidential support for our beliefs cannot be uprooted from the human mind. However, we should be very clear about what we mean by it. Such support can only be deductive, and mediated by models consisting of unproven and often implicit conjectures.
