“If you hand me your wallet,” says the mugger to Pascal, “I will perform magic that will give you an extra 1,000 quadrillion happy days of life.” So goes the thought experiment named Pascal’s mugging, created by Nick Bostrom.
The setting is as follows. You are approached by a mugger in the street who, instead of robbing you by force, promises to grant you thousands of days of bliss if you give her your wallet. While her claim is obviously implausible, you can’t prove it is impossible. Thus, given a large enough promised reward, the expected value of handing over your wallet is positive. (The expected value of an event is the probability that it will occur multiplied by the utility incurred. If you bet $10 on a fair coin turning up heads, then the expected amount of money you will receive is $5.)
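The coin example can be written out in a few lines of Python (a toy illustration of the definition, not part of the original essay):

```python
# Expected value of an event: its probability times the utility it yields.
def expected_value(probability, utility):
    return probability * utility

# A $10 bet on a fair coin landing heads pays out with probability 0.5,
# so the expected amount of money received is $5.
print(expected_value(0.5, 10))  # 5.0
```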
Suppose you assign a probability (or “credence”) of, say, one in a million to the mugger telling the truth. If she offers you ten million years of happiness, the expected value of yielding your wallet is still ten years of happiness. Indeed, whatever probability you assign, there will always be a reward large enough to compensate. Consequently, after struggling to see a way out of this arithmetically thorny situation, Pascal, as any rational expected-utility maximizer should, hands over his wallet.
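The mugger’s arithmetic is the same formula applied with bigger numbers. A sketch, using exact fractions to keep the arithmetic honest (the figures are the ones from the example above):

```python
from fractions import Fraction

def expected_value(probability, utility):
    """Expected value: probability of an outcome times the utility it yields."""
    return probability * utility

# One-in-a-million credence, ten million years of promised happiness:
credence = Fraction(1, 1_000_000)
reward_years = 10_000_000
print(expected_value(credence, reward_years))  # 10 (years of expected happiness)

# Whatever credence p > 0 you assign, a reward of target / p restores the
# expected value to target -- the mugger can always outbid your skepticism.
p = Fraction(1, 10**12)
target = 10
assert expected_value(p, target / p) == target
```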
As bizarre as the problem may sound to those unfamiliar with expected-value reasoning, this “paradox” has been the subject of much attention in the effective altruism and rationality communities. There have been numerous attempts to solve it, most involving some manipulation of the underlying mathematics. Robin Hanson argued that we should penalize the prior probability to reflect our incredulity at the mugger’s claims. Gwern examined and rejected the possibility of placing an upper bound on utilities or rounding low probabilities to zero. Nintil argued that the resolution lies in choosing the correct model. Briggs argued that unconditional events don’t have probabilities.
Disturbingly, no one seems to accept that the problem may be inherent to this style of reasoning itself. Indeed, when one steps outside the (entirely self-imposed) confines of mathematical reasoning, the solution to the riddle is simple: Pascal should not hand over his wallet because the mugger offers a bad explanation. To many, this solution will undoubtedly appear quite shallow at first glance. No sample space was defined, Bayes’ theorem did not make an appearance, and we didn’t have to use Knuth’s arrow notation. This solution, however, is the result of a deep insight into the nature of knowledge and how it is generated. This epistemology has been expounded upon by a few philosophers, and has been most fully refined by the physicist David Deutsch in his book, The Beginning of Infinity.
Explanations are the roots of our knowledge. What we prize most in our best, most insightful theories is their explanatory power. The germ theory of disease explains the transmission of bacteria and viruses, plate tectonics explains the earth’s geography and the movement of its continents, the axis-tilt theory of the earth explains seasonality. More than simply ad hoc justifications, the above are examples of good explanations. They are risky: they posit a structure of reality that can be contradicted by observation. They are also strict (in Deutsch’s language, hard-to-vary): an explanation that can be changed to accommodate any observed phenomenon explains nothing. The explanation that storms are caused by Zeus (or, more generally, that anything is caused by God) is a bad explanation because it does not posit anything about the way the world works. ‘God did it’ can explain anything, and therefore explains nothing. It is simply an appeal to epistemological authority.
Importantly, we do not assign credences to different explanations. We do not believe that Newtonian mechanics holds with probability 0.19, and Einsteinian relativity with probability 0.81. Relativity is a better explanation, so we dispense with Newtonian mechanics and seek to improve relativity. The reason for this is an asymmetry between truth and falsity, and how we create knowledge.
The human condition is first and foremost one of fallibility. The truth is not obvious. If it were, there would be no need for the scientific establishment, there would be no religious wars, there would be no ethical disputes. Instead, generating knowledge is difficult, and even our best theories — scientific, political, ethical — are riddled with errors. Certainty is beyond reach. There is no final justification for our knowledge, no magical oracle we can consult for truth. Instead, there is only trial and error. We generate hypotheses and seek to refute them, to falsify them. The ones that stand up to our best criticism are the ones we abide by — whether it is a social policy or a scientific theory — but we never stop trying to improve them.
When it comes to discriminating between theories, our belief in their veracity is irrelevant. What’s important is ridding ourselves of false theories. Whatever credence various physicists gave to the unification of electricity and magnetism prior to the experiments of Hans Ørsted didn’t matter. Once Faraday and Maxwell developed the theory of electromagnetism, any good physicist adopted it as the best available explanation, and then started trying to improve it. Subjective probability estimates have nothing to do with it.
The view of knowledge whereby credences and beliefs are primary is called Bayesian epistemology. The Stanford Encyclopedia of Philosophy writes that
“The formal apparatus [of Bayesian epistemology] itself has two main elements: the use of the laws of probability as coherence constraints on rational degrees of belief (or degrees of confidence) and the introduction of a rule of probabilistic inference, a rule or principle of conditionalization.” (emphasis mine)
In other words, it attempts to govern how we should think and gives us rules dictating how convinced we should be in a theory given the current evidence. It is also incoherent and based on the provably wrong methodology of probabilistic induction. Bayesian methods are, of course, very useful in certain scientific enterprises (Bayesian inference in machine learning, for instance). Indeed, its success in narrower disciplines is arguably what led to its application in epistemology. This is, however, an unwarranted overextension of the methodology. Needless to say, Pascal’s mugging is not the only (pseudo) paradox that arises when you insist on quantifying degrees of belief.
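For readers unfamiliar with the machinery being criticized, conditionalization is just repeated application of Bayes’ theorem. A minimal sketch (the diagnostic-test numbers are invented purely for illustration):

```python
def bayes_update(prior, likelihood, likelihood_if_false):
    """Posterior P(H|E) via Bayes' theorem:
    P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    evidence = likelihood * prior + likelihood_if_false * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical diagnostic test: 1% base rate, 95% sensitivity,
# 5% false-positive rate. One positive result lifts the credence
# from 0.01 to roughly 0.16 -- this updating rule is the "principle
# of conditionalization" the encyclopedia entry describes.
posterior = bayes_update(prior=0.01, likelihood=0.95, likelihood_if_false=0.05)
print(round(posterior, 3))  # 0.161
```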
There are always multiple explanations for a given phenomenon (infinitely many, in fact). Rather than assigning each explanation a number and trying to generate knowledge by mathematical manipulation, we should recognize that knowledge creation is a creative discipline. Generating good explanations is very difficult; indeed, most of our ideas are bad ones. A bad idea should not be dignified with an arbitrary value between 0 and 1 and kept in our decision-making processes. Ideas deserve to be criticized ruthlessly, and to be discarded if they prove untenable. Moreover, Bayesianism does not account for how we generate good explanations in the first place. What was Marie Curie’s “credence” in the existence of polonium and radium? What about the Greeks’ “credence” in this new, weird form of government called “democracy”? New theories require bold creativity and an insistence on questioning established knowledge. They rarely seem likely a priori. We do not call good ideas “breakthroughs” for nothing.
So, back to the mugger. Her claim that she comes from another dimension (or, in alternate versions of the mugging, that she can instantaneously inflict pain on an astronomically large number of people) is a bad explanation. It is easy to vary — she can arbitrarily change her claim based on whatever you say your belief is. Just as the gods may be credited with any power that explains the most recent event, the mugger may change the offered reward to suit whatever your “credence” is. Moreover, the truth of her claim rests on a theory of reality not corroborated by our best evidence. Hence, just as we discard other bad explanations — astrological predictions, flat-earth theories, alien abductions — so too do we discard the mugger’s, and refuse her our wallet.
Ben Chugg is a research fellow at Stanford Law School, working at the intersection of computer science and public policy. He writes about ethics, knowledge, and progress. benchugg.com
Acknowledgements: Thanks to Vaden Masrani (@VadenMasrani) for helpful suggestions on an earlier draft of this essay.
Prior Probability: Probability of an event before receiving new information.
Sample Space: The set of all possible outcomes of an experiment.
Conditional Probability: The probability of an event given other information, e.g., the probability that it rains given that it’s cloudy.
Bayes’ Theorem: A widely used equation regarding conditional probabilities.
Epistemology: The study of knowledge.