The Riddler — Competitive Coin flipping
The Riddler is a weekly puzzle hosted on FiveThirtyEight with problems on logic, maths and statistics. For this week’s express puzzle, we have an interesting problem on event sequences -
Riddler Nation recently legalized sports betting, and of course everyone is excited to risk some cash on the national pastime: competitive coin flipping.
Every coin-flipping match is a faceoff between two teams. Each team selects a sequence of heads and/or tails to look for, and they simultaneously flip their own coin over and over until one team finds its sequence. If both teams find their sequences at the same time, they start over and flip until only team finds it. First to be the only team to have found its sequence wins.
You decide to get in on the action and go to one of these events. When you arrive, you see that the Red Team has chosen the sequence “heads-tails,” while the Blue Team has chosen “heads-heads.” You can get even odds on either team. Which team should you put your money on?
On the first glance, it looks like the problem is trivial and uninteresting. Assuming the coin is fair, it would appear that any sequence of heads or tails of a fixed length should occur with equal probability. And thus, both the teams have an equal chance of winning the game.
However, as it so often happens when dealing with probability, our ‘intuition’ pushes us in the wrong direction. Both teams would like to choose a sequence that appears in fewest coin flips on average. Thus, if we find the expected number of flips until a sequence is found, we’ll know which team to bet on.
Let R be a random variable that denotes the total number of coin flips until the Red Team finds its sequence (“heads-tails”). And B be the analogous random variable for Blue Team.
Consider the event where R = n, that is, the first instance of sequence HT appears at n-th coin flip. The sequence of outcomes can be written as a string:

Now for R to be n, the (n-1)-th and n-th flip outcome would have to be a head and a tail respectively. Furthermore, this would be the first instance of HT in the string. This forces us to only consider strings that begin with zero or more consecutive tails followed by zero or more consecutive heads upto the (n-2)-nd coin toss.
H H H … H H H T
T H H … H H H T
T T H .… H H H T
T T T ..… H H H T
T T T ….. T H H T
T T T ..… T T H T
For a string of length n, there are can be n-1 such valid strings (with leading tails ranging from 0 to n-2). As the coin is fair, probabilities of heads and tails are 1/2 each. Recall that every coin toss is independent, therefore probability of occurrence of each string of length n would be (1/2)^n.

Thus, the Red team should expect to find its sequence after 4 coin tosses on average. Let us do the same exercise for Blue team. Again we have the same sequence of outcomes:

Now for B to be n, the (n-1)-th and n-th flip outcome would both have to be heads. In this case, for the initial n-2 flips, we need to ensure that no two heads appear consecutively. To calculate the number of such strings, we can imagine that we are arranging tiles of T and HT to form an n-2 length string. It is clear that no two heads could appear consecutively then.
T T T T… T T H H
H T H T… T T H H
T T H T… H T H H
T H T T… H T H H
H T T T ... H T H H
T T H T … T T H H
Notice that counting the valid strings is a lot harder than the previous case. There could be variable number of Ts and HTs and they could be interspersed in various arrangements. But this method should still work; for every valid string there is a unique arrangement of T and HT tiles that makes the string and vice-versa. If we work it out, the number of valid strings comes out to be:

Let us avoid these complex calculations and try to evaluate the expectation in a different way. Assuming the expected value of B exists, we may write it in terms of outcomes of first few toss outcomes:

Here we have partitioned the sample space into different cases. In the first case, the initial two tosses are heads and the game ends after two moves. In the second case, a first head is followed by a tail and we can then assume that the game is in its initial state, only that it would take 2 more tosses that it would have originally taken to end. In the final case, the first outcome is a tail and a similar reasoning follows. Since all our cases are mutually disjoint and exhaustive, we may calculate the expected value to be 6 moves.
Somewhat counter-intuitively we observe that Red team would find its sequence sooner than Blue team on average and is the team that we should bet on. The reason this happens is that once the Red team gets a heads, even if it fails in the next flip, the game remains in the same state and it is still only one move away from ending the game. There is also a numberphile video that explains this.
However, if the rules were changed slightly and both teams continued with the same sequence in case of a tie, both teams should have equal odds to win. Now in case of a tie, blue team is more likely to win.
I experimented with a Monte-Carlo simulation with 1 million coin flips with sequences of length 4. The first plot shows number of occurrences of each sequence, and the second, mean flips required to find a sequence. HHHT, HHTT, HTTT, THHH, TTHH, and TTTH occur most frequently and require least number of flips (around 16) on average; whereas HHHH and TTTT require the most number of flips as expected. The last plot shows the histogram of number of flips required to find HHHT.



In the final plot, we plot the minimum, maximum and average mean flips required for sequences of different lengths. Starting with 2 length sequence as in the case of the problem and going up to 11 length sequences where mean flips required are over a thousand.

