Game Theory’s Most Divisive Problem: Can you outsmart a nearly perfect predictor?

Christian Keil
Pronounced Kyle
Apr 2, 2018 · 14 min read
Newcomb’s Problem: Should you take one box or two?

Popularized by Robert Nozick in 1969, Newcomb’s problem has become one of the most divisive thought problems in modern philosophy. It is formulated as follows:

You sign up for a psychology study, and are told that you have the chance to win one million dollars. Obviously excited, you run to East Hall and arrive at the room where the experiment is being held. In the room, there are two boxes: one transparent, containing $1000, and one opaque, containing either $0 or $1,000,000.

Figure 1: The boxes in Newcomb’s problem.

Upon entering the study, you know these possibilities, and know that you have the following choice: either (a) take both boxes, or (b) take only the opaque box.

The catch to the game is that before you enter the room, a nearly perfect predictor has predicted your choice, and he has placed the $1,000,000 in the opaque box only if he thinks you will choose (b).

You are confident in the predictor’s abilities, as he has predicted your behavior correctly in the past and has never been wrong in his predictions. Knowing all this, which option do you choose?

According to expected utility theory, you should choose (b) and take only the opaque box. The logic behind this recommendation is simple and intuitive. If you choose (a), the predictor is unlikely to have placed the $1,000,000 in the opaque box, meaning that the expected value of choosing (a) is only slightly over $1000. If you choose (b), conversely, the predictor will almost surely have placed the $1,000,000 in the opaque box, nearly guaranteeing you $1,000,000. As you would surely gain more utility from $1,000,000 than from $1000, expected utility theory recommends that you choose (b).
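To make this recommendation concrete, here is a minimal sketch of the expected-value arithmetic, assuming (purely as an illustration, since the problem only says “nearly perfect”) that the predictor is correct with the same probability p no matter which choice you make:

```python
# Expected value of each choice when the predictor is correct with probability p.
# Assumption: the same accuracy p applies to one-boxers and two-boxers alike.

def expected_values(p):
    one_box = p * 1_000_000 + (1 - p) * 0        # opaque box is full iff he predicted one-boxing
    two_box = p * 1_000 + (1 - p) * 1_001_000    # opaque box is empty iff he predicted two-boxing
    return one_box, two_box

for p in (0.99, 0.9, 0.5005):
    one, two = expected_values(p)
    print(f"accuracy {p}: one-box EV = ${one:,.0f}, two-box EV = ${two:,.0f}")
```

Under this assumption, one-boxing has the higher expected value whenever 1,000,000p > 1,000p + 1,001,000(1 - p), that is, whenever p > 0.5005. A predictor who is right just barely more than half the time already tips the calculation, and a nearly perfect one tips it overwhelmingly.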

This has become known as the “one-box” solution to Newcomb’s problem, in contrast to the “two-box” solutions that give arguments for choosing (a). A third option (the “no-box” solution) has made a minor impact in the literature on Newcomb’s problem (Maitzen & Wilson, 2003), but the modern debate is primarily a contest between two sides: the one-boxers, who use expected utility theory as described above, and the two-boxers, who use a related, but notably different system of preference evaluation named “Causal Decision Theory” (Weirich, 2008).

The two-boxers argue that the one-box solution has a particularly damning flaw: your choice cannot change what has already been placed in either box. Once you enter the room of the experiment, the two-boxers argue, the $1,000,000 is either in the opaque box or it is not — there is no third option — so why not take both boxes and walk away with an extra $1000?

This focus on causality is echoed in the general philosophy of Causal Decision Theory. Its general structure is similar to expected utility theory, as it uses probabilities and utilities to measure the overall good of an action, but Causal Decision Theory argues that when an individual has no causal relation to the ends of a particular action or lottery, they ought to use the recommendations of the “dominance principle,” rather than expected utility calculation, to determine the rational course of action (Nozick, 1969).

Simply understood, the dominance principle argues that individuals ought to prefer those actions that always provide at least as much utility as any other alternative, and occasionally provide more.

Applied to Newcomb’s problem, the two-boxers use the dominance principle as evidence that (a), the two-box solution, is the correct choice. The predictor either places the money in the box or he does not, they argue, and since choosing both boxes yields more money in either state, two-boxing is the dominant choice (Figure 2).

Figure 2: The payoffs to Newcomb’s problem, as understood by Causal Decision Theory (Weirich, 2008). As $1,001,000 is greater than $1,000,000 and $1000 is greater than $0, this formulation of Newcomb’s problem suggests that the two-box solution, (a), dominates the one-box solution, (b).
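The dominance reasoning itself is mechanical, and it can be sketched as a simple check over the payoff table in Figure 2 (the dollar amounts come from the figure; the helper function is my own illustration):

```python
# Dominance check over the Figure 2 payoff table, where the states of nature
# (per Causal Decision Theory) are "the opaque box is full" and "it is empty".
payoffs = {
    "two-box": {"box full": 1_001_000, "box empty": 1_000},
    "one-box": {"box full": 1_000_000, "box empty": 0},
}

def dominates(a, b, table):
    """True if action a pays at least as much as b in every state, and strictly more in some state."""
    states = table[a].keys()
    return (all(table[a][s] >= table[b][s] for s in states)
            and any(table[a][s] > table[b][s] for s in states))

print(dominates("two-box", "one-box", payoffs))  # True: under this framing, two-boxing dominates
```

The entire two-box argument rests on the states in this table being the right ones, a point I return to below.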

The argument made for the two-box solution, then, can be understood as dependent on two separate, independently necessary claims. First, it depends on the application of Causal Decision Theory, or more specifically, on the claim that the actor in Newcomb’s problem has no causal relationship to the ends she may bring about. As causal decision theorists admit, if such a relationship does exist, the dominance reasoning does not apply (Nozick, 1969). Second, the two-box answer depends on the conclusion of the dominance principle. Even if Causal Decision Theory is appropriate generally, a denial of the dominance principle would serve as evidence to discount the two-box solution to Newcomb’s problem — without the dominance principle, Causal Decision Theory is functionally the same as expected utility theory (Nozick, 1969).

In the following analysis, I will argue that both of these claims are unfounded: that Causal Decision Theory is inappropriate for analyzing Newcomb’s problem, and that the dominance principle does not support the two-box solution. In doing so, I will develop two independent reasons to support the one-box recommendation of expected utility theory; if the two-box solution is unjustified, the one-box solution is the only reasonable alternative.

To begin, I will focus on the application of Causal Decision Theory to Newcomb’s problem. I contend that this application is inappropriate for two reasons.

First, the premise is simply false: an individual’s actions in the problem do have an impact on her expected earnings. Causal decision theorists cite the impossibility of reverse causation to reject this possibility, arguing that the actor’s decision in the room cannot affect the decision of the predictor (i.e., an action that happened in the past). Because the actor’s decision between (a) and (b) can have no true causal relationship to the predictor’s decision, the argument goes, her decision has no relationship to her potential earnings (Locke, 1979).

This argument quickly gives way to complex discussions of causality and free will, but the true problem with it is far more basic: in order for a player to benefit from choosing both boxes, the predictor must make an incorrect prediction, which is extremely unlikely. While discussions of causality and free will are interesting, they are beside the point — the very conditions of the problem make them unnecessary. We know that the predictor is almost never wrong, and that fact alone should be enough to accept the accuracy of his predictions.

This answer is likely unsatisfying. Individuals insist on knowing how certain “magical” acts, like that of the predictor, are possible, as was seen in modern cognitive psychology’s rejection of “black box,” behaviorally based arguments (Sternberg, 2008). Luckily for those individuals, there have been attempts to explain and demystify the actions of the predictor — far from being magical, the power of nearly perfect prediction seems entirely plausible. The explanations for this power are incredibly diverse, ranging from invocations of a common cause for both the prediction and the action (Eells, 1982) to quantum-mechanical exceptions to causality (Schmidt, 1998) to an appeal to the predictor’s power of psychological observation (Bach, 1987), but all conclude the same way: for whatever reason, the predictor is correct. As Price (1986) notes, the predictor must follow the “principle of total evidence” (p. 199) in order to be so accurate — simply, he has access to (and effectively takes account of) any and all information necessary to make accurate predictions.

Given the predictor’s near infallibility, it would seem irrational to bet against his power, but the two-box solution demands that one do just that. The only reasonable way to escape this accusation is to emphasize the difference between the predictor’s nearly perfect accuracy and true (i.e., 100%) perfection (Kavka, 1980), but this distinction is trivial. If a certain action has always been correlated with a certain result, it is better to behave as if there were a known causal relationship between the two, even when a true causal relationship cannot be identified (Bar-Hillel & Margalit, 1972). For example, even though the true cause of gravity is unknown, it would be unwise to stand under heavy falling objects. Simply put, the correlation between the predictor’s predictions and your past behavior should be enough to expect that the predictor will know when an individual will attempt to exploit him.

Given this relationship between the individual’s decision and her expected winnings, the application of Causal Decision Theory to Newcomb’s Problem is inappropriate. Individuals can expect to be rewarded for trusting the accuracy of the predictor’s judgments, so they ought to maximize their expected utility, and take only the opaque box.

The second major problem with the application of Causal Decision Theory to Newcomb’s problem comes through a vein of argument that has recently gained popularity — that Newcomb’s problem is a type of Prisoner’s Dilemma. Normally, finding fault with one justification of a theory would not be sufficient reason to discount the theory as a whole, but in this case, the justification sheds light on a fundamental misunderstanding of Newcomb’s problem that causal decision theorists tend to make.

This misunderstanding will be examined shortly, but to begin, it is important to clarify the claim being made: that Newcomb’s problem is a Prisoner’s Dilemma. This idea was first proposed by David Lewis (1979), with his formulation of a Prisoner’s Dilemma that shared the distinctive features of Newcomb’s problem (Figure 3). In this hybrid Prisoner’s Dilemma, one prisoner is given a choice between defecting (with a payoff matching the two-boxing decision of Newcomb’s problem) and cooperating (matching the payoff of one-boxing). To simulate the conditions imposed by the presence of the predictor, the prisoner is told that his accomplice has already made his decision, and that pairs of prisoners have almost always chosen the same option (either both defecting or both cooperating) in the past.

Figure 3: The payoffs a prisoner may receive in Lewis’ (1979) Prisoner’s Dilemma / Newcomb Problem hybrid.
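Figure 3 is not reproduced in detail here, but Lewis’ construction is usually rendered with payoffs that mirror the Newcomb amounts: each prisoner receives $1,000,000 if his accomplice cooperates, plus an extra $1,000 if he himself defects. A sketch under that assumption:

```python
# Sketch of Lewis' Newcomb-flavored Prisoner's Dilemma, assuming payoffs that
# mirror the Newcomb amounts: $1,000,000 if your accomplice cooperates,
# plus an extra $1,000 if you yourself defect.
def payoff(my_move, their_move):
    reward = 1_000_000 if their_move == "cooperate" else 0
    if my_move == "defect":
        reward += 1_000
    return reward

for mine in ("cooperate", "defect"):
    for theirs in ("cooperate", "defect"):
        print(f"I {mine}, accomplice {theirs}s: ${payoff(mine, theirs):,}")
```

Read “cooperate” as one-boxing and “defect” as two-boxing, and the table collapses back into Figure 2.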

Lewis ultimately uses this new formulation of Newcomb’s problem to argue for the two-box solution, but the content of his argument is significantly less interesting than its form — in fact, the application of game theory to Newcomb’s problem has inspired a number of responses, one of which is particularly notable for the present analysis. Using the game-theoretic framework, Christoph Schmidt-Petri (2005) argued that if Newcomb’s problem were a one-shot game (that is, if it were only played once), it would be best to apply Causal Decision Theory and defect, but if the game were repeated forever, it would be advantageous to side with the conclusions of expected utility theory and cooperate. This interpretation may have some intuitive appeal, but it ultimately misunderstands the mechanics of Newcomb’s problem — even though the “game” of Newcomb’s problem is formulated as if it is only to be played once, it ought to be treated as a repeated game because of the predictor’s knowledge.

This short-sighted underestimation of the predictor is a fundamental flaw of the two-box solution to Newcomb’s problem: although the predictor is playing this game for the first time, his knowledge of you makes his decision resemble one made in a repeated game. In a repeated game, any player would know from past experience whether or not to trust you, but the predictor needs no such experience. The predictor already knows who you are as a player, and therefore can predict with great accuracy what decision you will make.
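One way to see the force of this point is to simulate the “repeated” game the predictor is effectively playing: a predictor who reads a player’s disposition correctly with some fixed probability, facing a committed one-boxer and a committed two-boxer over many rounds. This is only an illustrative sketch; the 99% accuracy figure is my own choice, not something the problem specifies:

```python
import random

# Illustrative simulation: a predictor who reads the player's disposition
# correctly with probability p, facing a committed one-boxer and a committed
# two-boxer over many rounds. The 99% accuracy is an assumed figure.
def average_payoff(disposition, p=0.99, rounds=100_000, seed=0):
    rng = random.Random(seed)
    other = "two-box" if disposition == "one-box" else "one-box"
    total = 0
    for _ in range(rounds):
        prediction = disposition if rng.random() < p else other
        opaque = 1_000_000 if prediction == "one-box" else 0
        total += opaque if disposition == "one-box" else opaque + 1_000
    return total / rounds

print(f"committed one-boxer averages ${average_payoff('one-box'):,.0f} per round")
print(f"committed two-boxer averages ${average_payoff('two-box'):,.0f} per round")
```

The committed one-boxer averages close to $990,000 per round while the committed two-boxer averages close to $11,000, which is exactly the gap the expected-value arithmetic above predicts.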

This seemingly paradoxical knowledge of Newcomb’s predictor has been described as the “hidden circularity” or “mirror” of Newcomb’s problem (Slezak, 2006), and is another fundamental reason why the application of Causal Decision Theory to the problem is inappropriate. In the same way that the knowledge gained by players in a repeated game makes attempting to maximize their expected utility the most prudent option, the knowledge of the predictor is such that any rational player ought not try to choose a “dominant” strategy.

Maximizing expected utility is the only rational choice for a player in Newcomb’s game, as the application of Causal Decision Theory is inappropriate. As we have seen, two observations stand as evidence for this conclusion: that an agent’s decision can affect her own probability of success, and that the nature of Newcomb’s predictor is such that defecting (to use Lewis’ terminology) will almost inevitably result in a lesser reward. Each of these observations warrants a rejection of Causal Decision Theory and a corresponding acceptance of expected utility theory; expected utility theory provides a logical, coherent framework with which to analyze Newcomb’s problem and to explain an individual’s preferences within it.

The second major line of argument against the two-box solution to Newcomb’s problem, and for the acceptance of expected utility theory, comes as a critique of the dominance principle. As explained earlier, the dominance principle is a necessary element of the two-box proof — even if Causal Decision Theory were an appropriate framework with which to analyze Newcomb’s problem, it would not matter unless the dominance principle gave a warranted reason to prefer two-boxing. I will contend that it does not, for two reasons.

First, the proof that the dominance principle uses to justify two-boxing is based on a set of faulty counterfactuals. The primary claim of the dominance principle is that the $1000 in the clear box is always available as an extra reward to any individual given Newcomb’s problem — if an individual chose to take only the opaque box and was awarded $1,000,000, she might see the $1000 as a foregone prize (as seen in Figure 2). In fact, this logic is sound in most games involving risk. For example, if I were a contestant on The Price Is Right, I would likely kick myself if I accepted the first showcase, only to find that the Jet Ski was actually behind the second curtain.

This “try now, evaluate later” approach, however, does not work in the world of Newcomb’s Problem. Given the uniqueness of the predictor’s task — attempting to predict your choice, knowing that you know his abilities — you are in no better position to evaluate your expected payoffs after playing the game than you might have been before playing it (Kavka, 1980). As we saw above, the predictor’s nature is such that your decision between one-boxing and two-boxing has an impact on the payoffs you may receive, and the very possibility of this impact is enough to deny the validity of the counterfactuals two-boxers use as evidence for the dominance principle. Such comparisons are useless in a world as dynamic as that of Newcomb’s problem — the very possibility that the predictor may change his action if you reason differently about yours makes such simple counterfactuals inaccurate, and therefore, the dominance principle may not give an accurate answer to Newcomb’s problem.

The second, and decidedly larger, problem with the dominance principle’s application to Newcomb’s problem once again comes as a result of a mischaracterization. According to the picture drawn by the dominance principle (Figure 2), the state of nature is such that upon entering the room, the predictor has made one of two choices: either to place the money in the box, or to refrain from doing so. This game-theoretic understanding of the problem, however, seems to be inaccurate.

Based on work done by John Ferejohn, Steven Brams (1975) gave a strong argument for an alternate interpretation of Newcomb’s problem — in contrast to the game-theoretic picture drawn by Lewis, Brams argues that the problem ought to be understood as “decision-theoretic” (p. 599). The distinction between the two is subtle, but extremely consequential. Rather than treating the predictor’s decision as set once one enters the room, the decision-theoretic approach argues that the true “state of nature” lies in the predictor’s accuracy. When you enter the room, Brams argues, the predictor’s choice is either correct or incorrect, and this understanding clearly defeats the traditional conclusion of the dominance principle, as neither choice (one- or two-boxing) dominates the other (Figure 4). The only hope for the application of the dominance principle to escape this claim is to criticize Brams’ reformulation, but as we will see, this new conceptualization is justified.

Figure 4: The (newly realized) payoffs of Newcomb’s game (Brams, 1975). As this illustration makes clear, neither decision dominates the other ($1,000,000 is greater than $1000, but $0 is less than $1,001,000), rendering the dominance principle useless.
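Under Brams’ framing, the same mechanical dominance check used for Figure 2 returns a different verdict; a minimal sketch, with the dollar amounts taken from the Figure 4 caption:

```python
# Dominance check under Brams' reformulation, where the states of nature are
# "the predictor is correct" and "the predictor is incorrect".
payoffs = {
    "one-box": {"predictor correct": 1_000_000, "predictor incorrect": 0},
    "two-box": {"predictor correct": 1_000, "predictor incorrect": 1_001_000},
}

def dominates(a, b, table):
    states = table[a].keys()
    return (all(table[a][s] >= table[b][s] for s in states)
            and any(table[a][s] > table[b][s] for s in states))

print(dominates("two-box", "one-box", payoffs))  # False
print(dominates("one-box", "two-box", payoffs))  # False: neither choice dominates
```

With no dominant choice available, the two-boxer is thrown back on expected utility, which is precisely the calculation that favors one-boxing.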

One potential justification of Brams’ reformulation is the claim discussed extensively above — that there is a strong relationship between an individual’s actions and her potential payoff (Slezak, 2006) — but a far simpler observation is sufficient justification by itself: the predictor’s accuracy is independent of an individual’s choice (Cargile, 1975). It seems reasonable, and surely simpler, to assume that this is the case. Indeed, the accuracy of the predictor is significantly easier to conceptualize as a “state of nature” in Newcomb’s problem than is the predictor’s actual decision — while the probability that the predictor is correct is specified a priori by the problem itself, his prediction is undeniably dependent in some way on your decision. As Maya Bar-Hillel and Avishai Margalit (1972) explain, “You cannot outwit the [predictor] except by knowing what he predicted, but you cannot know, or meaningfully guess, at what he predicted before actually making your final choice” (p. 302). The independence of the predictor’s accuracy thus serves as strong evidence for Brams’ reformulation, and (as shown in Figure 4) means that the dominance principle is inapplicable to Newcomb’s problem.

In the end, it is clear that expected utility theory’s one-box solution to Newcomb’s problem is superior to the two-box solution rendered by Causal Decision Theory. As we have seen, there are two fundamental problems with the two-box solution. First, Causal Decision Theory is inappropriate for Newcomb’s problem, as a player’s payoff very clearly depends on her actions. The predictor’s power ought not be underestimated — he is nearly perfect, and therefore knows what to expect — but Causal Decision Theory does just that. An individual’s actions affect her potential winnings, so Causal Decision Theory cannot apply to Newcomb’s problem. Second, the dominance principle — the way in which Causal Decision Theory determines preferences — is based on faulty justifications and mischaracterizes the state of nature in Newcomb’s problem.

So, what do you think? Would you choose one box or two?

I wrote this paper in 2011 as a junior at the University of Michigan; it eventually won the Sweetland Prize for Excellence in Upper-Level Writing and the Bowling Green State University Economics Paper Contest.

References

Bach, K. (1987). Newcomb’s problem: The $1,000,000 solution. Canadian Journal of Philosophy, 17(2), 409–425.

Bar-Hillel, M. & Margalit, A. (1972). Newcomb’s paradox revisited. The British Journal for the Philosophy of Science, 23(4), 295–304.

Bernoulli, D. (1738). Exposition of a new theory on the measurement of risk. Translated from Latin into English by Dr. Louise Sommer in 1954. Econometrica, 22(1), 23–26.

Brams, S. J. (1975). Newcomb’s problem and prisoner’s dilemma. The Journal of Conflict Resolution, 19(4), 596–612.

Cargile, J. (1975). Newcomb’s paradox. The British Journal for the Philosophy of Science, 26(3), 234–239.

Eells, E. (1982). Rational decision and causality. Cambridge University Press.

Horwich, P. (1985). Decision theory in light of Newcomb’s problem. Philosophy of Science, 52(3), 431–450.

Kavka, G. S. (1980). What is Newcomb’s problem about?. American Philosophy Quarterly, 17(4), 271–280.

Kiekeben, F. (2000). Newcomb’s paradox. Short Essays on Philosophy. Retrieved from http://www.kiekeben.com/

Lewis, D. (1979). Prisoner’s dilemma is a Newcomb problem. Philosophy & Public Affairs, 8(3), 235–240.

Locke, D. (1979). Causation, compatibilism and Newcomb’s problem. Analysis, 39(4), 201–211.

Maitzen, S. & Wilson, G. (2003). Newcomb’s hidden regress. Theory and Decision, 54(2), 151–162.

Martin, R. (2008). The St. Petersburg paradox. In E. N. Zalta (Ed.), Stanford Encyclopedia of Philosophy.

Nozick, R. (1969). Newcomb’s problem and two principles of choice. Essays in Honor of Carl G. Hempel, 114–146.

Price, H. (1986). Against causal decision theory. Synthese, 67, 195–212.

Schmidt, J. H. (1998). Newcomb’s problem realized with backward causation. The British Journal for the Philosophy of Science, 49(1), 67–87.

Schmidt-Petri, C. (2005). Newcomb’s problem and repeated prisoner’s dilemmas. Philosophy of Science, 72(5), 1160–1173.

Slezak, P. (2006). Demons, deceivers, and liars: Newcomb’s malin génie. Theory and Decision, 61, 277–303.

Starmer, C. (2000). Developments in non-expected utility theory: The hunt for a descriptive theory of choice under risk. Journal of Economic Literature, 38, 332–382.

Weirich, P. (2008). Causal decision theory. In E. N. Zalta (Ed.), Stanford Encyclopedia of Philosophy.
