The Intractability of Certain Combinatorial Problems in Poker

When Counting Things in Poker is Too Hard

Zynga Poker Engineering

Published in

Zynga Engineering

13 min readOct 15, 2018

By Jarred Simmer | Principal Software Engineer
October 15, 2018

A Simple Question

In a game of poker, particularly in online poker, players are wary of whether the game is fair or, as so often believed, “rigged”. This is a natural and even wise concern to have. I would venture that nobody likes the feeling of having been cheated, particularly where money or large investments of time are concerned. This concern is a protective mechanism that shields us from harm to our time or our bankroll.

There are several elements of “fairness” in poker, though: whether everyone has the same amount of time with which to act, whether one player has additional information that is not available to everyone, or whether the cards are truly dealt randomly. Each of these elements deserves their own examination, but for this entry I’ll be focusing on the last of these problems: How do we show that a deck is shuffled randomly?

Enumerating Shuffles

When you shuffle a deck of cards in real life, what you’re actually doing at the end of the shuffle is picking 1 of 52 cards to be the top of the deck, then picking 1 of 51 remaining cards to be the second from the top, then 1 of 50 remaining cards to be the third from the top, all the way until you only have 1 card left and you place it on the bottom of the deck.

This means that you have 52 possibilities for the first card, then 51 possibilities for the second card, then 50 possibilities for the third card, etc. down to one. Your total ways you can shuffle the deck, then, is 52 times 51 times 50 etc. down to one. This is known as 52! (52 factorial). This equates to roughly 8 x 10⁶⁷. Which is a big number. A really big number. A really, really big number. SO big, in fact, that it deserves an explanation of its size before we move on.

Illustrating 8 x 10⁶⁷

This illustration is taken from this YouTube video. It’s an excellent video and I highly recommend watching it.

Let’s say you have a computer. That computer is capable of producing 1 unique shuffle it has never before generated once per millisecond, or 1000 times per second.

Now, we start the computer up. While it’s working, you stand on the equator and wait. One million years. Then you take a single step east, and wait, again, one million years. Then you take another step, wait a million years, and another step, wait a million years, until finally you walk the entire circumference of the Earth and end up back where you started.

And then… You take a single drop out of the Pacific Ocean. Then you take a step, wait a million years, take another step, wait a million years, repeating until you end up back in your starting position at which point you take a second drop out of the Pacific Ocean. Then you repeat that whole process, walking around the earth incredibly slowly, taking a drop out of the Pacific Ocean, until finally, at the end of all of it, the Pacific Ocean is completely dry.

And then… you instantly fill the ocean back up and put a single piece of paper down on the ground. Then you start your slow march again, gradually draining the ocean until it’s empty, at which point you again refill it and put a second piece of paper down on the ground. Now, you repeat that entire process until eventually this stack of papers you’ve been slowly growing finally reaches the sun.

At that point, can you give an educated guess as to how many unique shuffles your computer still has yet to generate? As it turns out, you still have roughly 8 x 10⁶⁷ shuffles left to generate. That’s right — you’ve barely generated any in comparison to how many you have left — roughly .032%. You will need to repeat the entire process of walking the earth, draining the ocean, and stacking the papers to the sun 3,000 times before your computer will have exhausted all unique shuffles.

A Big Problem

One aspect of determining if a shuffler is truly random is to ensure that its shuffles appear an equal number of times over a given period. Before solving for that aspect though, let’s tackle a different problem to see where it takes us: Knowing that 52! is such a massive number, what is the probability when you shuffle a deck of cards that no deck of cards has ever been shuffled into exactly that arrangement?
In order to tackle this question, we’ll make a few simplifying assumptions

Playing cards in their current state have been around for approximately eight centuries (800 years)
Decks of playing cards were shuffled into a random configuration on average ten billion times per day over that period of time
During that time, decks were never shuffled the same way more than once.

This gives us 2.922 x 10¹⁵ unique shuffles over that period of time.

Now we shuffle our own deck of cards, resulting in 1 of 52! possible arrangements.

Given these two facts, we then know that the probability that our particular ordering of the cards has NOT occurred in the past is
(1–1/(52!))^(2.922×10¹⁵).

When you type this into Google, you see that it just gives up and rounds to 1. I ended up using this tool to get an exact answer. It comes out to be this number:

0.99999999999999999999999999999999999999999999999999996377304692200000000000000000000000000000000000000000065619606465780661247113598295453899999999999999999999207600531855780066488211132633065168698062

Whenever you shuffle a deck, it is, as it turns out, a near certainty that you end up with an arrangement that has never before been generated.

This has an interesting implication. If whenever you shuffle a deck it is a near certainty that you will end up with an arrangement that has never before been generated, and there are more arrangements than you could possibly generate in a lifetime, how can you verify that a shuffle is completely random? Even if you verify that every arrangement the shuffling algorithm has ever generated is unique, how can you be certain that, for example, the shuffler does not always arrange the deck such that the queen of hearts is not on the bottom?

Another Big Problem

For the sake of argument, let’s say that you can know with certainty that each arrangement of cards is actually equally likely to be generated. To visualize another aspect of randomness verification, let’s visualize each unique shuffle as a color (or for the color blind, a letter).

Each color here represents a unique ordering of cards, each with an equal expected number of occurrences over a given period.

Let’s suppose our shuffler runs, and runs, and runs, and runs, until somehow we actually get through all 52! possible arrangements, each appearing once. Looks good, right? But then we keep generating arrangements, and we might see something like this.

Visualizing Orderings of Unique Shuffles

Though each arrangement of cards appears an equal number of times and each arrangement is itself random, the arrangements of these arrangements may not be random. With no information other than what shuffles have appeared in what order, once the sequence begins to repeat you would be able to predict the next shuffle, making these shuffles not truly random.

A Common Attempt at a Solution

So far, these discussions of randomness have been more theoretical than practical. After all, there’s no way you could get through all possible shuffles in the first place in order to see if they appear with equal probability, let alone if they appear in a recurring pattern. So, how do people verify randomness in practice?

One method is to take the random number generator used for shuffling and randomly generate X and Y coordinates of a point and plot it on a graph. Repeat this enough times and eventually you might see a pattern emerge through simple visual inspection.

Scatter Plot of Randomly Generated X, Y coordinates (1)

In this case, numbers on the low and high end of the range are far less likely to be generated than those in the middle of the range.

This is another pattern you might see. Do you know what’s wrong with a number generator that causes this?

Scatter Plot of Randomly Generated X, Y coordinates (2)

Since we are generating numbers like this: X1, Y1, X2, Y2, …. and we see strong relationships between Xn and Yn, the problem is that each number is more likely to be close to its predecessor than it is to be far away.

Visually inspecting these plots, it’s easy to discern a pattern. Now, we’ll try a short experiment. Given the two graphs below, which one do you think is more random?

If you are like the majority of people I gave this test to in our office, you’ll have chosen the graph on the left. However, the graph on the right was actually the one randomly generated. The left graph is made up of points I manually entered.

This experiment highlights the main problem with the plotted point approach to answering the question of whether or not a random number generator is actually random: people are so good at recognizing patterns that we look for and find them even when they are unintentional. Furthermore, we attribute meaning to those patterns. In this case, the graph on the right has more dots in the bottom left than it does in the top right, in turn lending it to be perceived as a triangle shape. Because I had just shown the previous graph with a similar bias issue, you assumed that this graph was also biased.

Problems with Similar Solutions

This is not a problem with this solution just in this limited example, though, nor is it only through visual inspection that we find issues with pattern recognition in determining randomness — it’s also auditory and on a large scale.

Spotify ran into a problem where even though they were completely randomly shuffling music (on a playlist, for example) the fact that songs by the same artist or even on the same album appeared back to back as a result of this shuffling caused their users to feel like the shuffling was broken because they detected a pattern, even though that pattern was itself completely randomly generated.

Poláček, L. (2014, February 28). How to shuffle songs? [Web log post]. Retrieved October 9, 2018 from https://labs.spotify.com/2014/02/28/how-to-shuffle-songs/

Apple encountered the exact same problem with its shuffling algorithm. They fixed it by actually making the shuffling less random in order to make it feel more random.

…true randomness is likelier to produce repetition than is a chain of totally unrelated selections. As Apple tries to fix this, “we’re making it less random to make it feel more random,” Mr Jobs says.
- Maslin, J. (2006, October 19) His Heart Belongs to (Adorable) iPod [News Article] Retrieved October 9, 2018 from https://www.nytimes.com/2006/10/19/books/19masl.html

Taking this back into the casino mindset for a moment, an example of people’s pattern recognition and attribution appears on the casino floor. One of the simplest and most effective ways is through the roulette display board. These boards display the recent past numbers rolled on the roulette wheel, including the color. What this causes people passing by to do is recognize a pattern. In the image on the right, “Hey, there have been a lot more black numbers than red numbers.” People who misunderstand how probability works then think to themselves “Red and black numbers are supposed to appear an equal number of times because they’re equally likely. There are far fewer red numbers that have appeared, so a red number must be due.” This faulty logic is known as the Gambler’s Fallacy. People also misunderstand a principle of probability known as the Law of Large Numbers and arrive at the same conclusion. I’ll leave it up to you to learn more about the Gambler’s Fallacy and the Law of Large Numbers on your own.

A Better Approach

The takeaway from all these examples is that people are not reliable when it comes to manually inspecting for randomness. And we already know that given a large enough problem space no computer exists that could generate let alone verify all elements of randomness. So as a software engineer, what can you do?

Math. Rather than try to observe randomness in the outcome, we can prove that the method we use to generate numbers is random.

One method that has been proven to produce random sequences of numbers given proper parameters is the pseudorandom Linear Congruential Generator. This is, for example, how Java’s java.util.Random class works. To generate a number, you take a predefined constant “a”, multiply it by the previously generated number, add a predefined constant “c”, and take that value modulo some predefined constant “m”.

Pseudorandom Linear Congruential Generator

Because a number is recursively defined, we need to have an initial number sub i. This value is known as the “seed”. It’s a number that you can generally either specify yourself (mostly useful for debugging) or have the algorithm implicitly provide. When implicitly provided by the algorithm, it is usually either time based (such as milliseconds since the computer started) or derived from environmental noise (such as measurements from hardware like cameras or thermometers). Some of the more extreme, more secure seeds are something unpredictable, such as the precise combination of radioactive isotopes monitored for a short period of time that have undergone radioactive decay. Such a generator can be found here.

The choices for the three constants are obviously very important as well. For example, if “a“ and “c” are both 0, you will always get 0. If “a” is 0 and c is 1, you’ll always get 1 mod “m”. I won’t go into how exact values for “a”, “c”, and “m” are chosen (you can find the values chosen for popular implementations here), but I will briefly talk about the importance of “m”. Because we are modding by “m”, the number we generate will never be higher than “m”. So, if “m” is 2⁶⁴, our random number will never be higher than 2⁶⁴. However, taking Java’s Random as an example, it actually throws out the lowest order 16 bits as their combinations are not actually evenly distributed. This leaves us with 48 usable bits, meaning that the maximum value we can generate once the low order 16 bits are thrown out is 2⁴⁸. This upper limit is known as the “Period” of the generator.

Because we’re generating whole numbers between 0 and 2⁴⁸, after at most 2⁴⁸ iterations we will be guaranteed to repeat a number that we’ve already generated. When that happens, all following numbers in their exact order are known with certainty by anyone who kept track of the numbers already generated, even without knowing “a”, “c”, “m”, or the seed. This takes us back to the problem we discussed earlier of a sequence being eventually repeated.

This is why after some number of iterations less than the period, best practice is to re-seed the generator. That is, restart the random number generator using a new seed. This is a way of ensuring that any pattern that exists in the Linear Congruential Generator will not persist through a re-seeding. Though it does not eliminate the possibility of a pattern emerging, it does make it less likely and in any case limits such a pattern’s length to the number of iterations between re-seeds (where otherwise there would be no limit to its length).

What Does Zynga Poker Do

Zynga Poker prides itself on being a fair and trusted gaming platform. To further this trust, we asked Gaming Labs International (GLI), a leading independent certification agency for the casino industry, to certify the Random Number Generation algorithm used in our game as unbiased. We are happy to share that Zynga Poker is RNG certified by this fully accredited independent third-party lab, demonstrating the security and fairness of our card shuffling and gaming procedures.

GLI is the world’s leading independent testing, inspection and certification laboratory for the gaming industry. It has 21 locations spread across Africa, Asia, Australia, Europe, North America and South America. GLI provides testing and certification services to more than 455 jurisdictions worldwide. For more information please visit www.gaminglabs.com.

As the world’s leading online poker game, game fairness has always been a top priority for Zynga Poker. On occasion, we receive complaints from a small subset of players, typically after a losing streak, that the game is rigged and the results are controlled by the dealer. Even though Zynga has never controlled the games’ outcomes, we take user feedback very seriously. To quell any doubts in the minds of the players, Zynga requested GLI to independently verify the algorithms in Zynga Poker to be Random to assure players about Zynga’s commitment to game fairness.

We want players to be confident that the game is fair and does not favor or disfavor any specific individual or player type. With this certification, you can be confident that your performance in the game is not controlled in any way by the dealer.

Conclusion

It is incredibly difficult to prove that the shuffling of a deck of cards using a computer is completely random as the space in which we are working is too large to completely search in a reasonable amount of time. Human inspection has been used for detecting patterns in small sample sizes of random number generators, but humans have an innate bias during sensory-based inspection. Pseudorandom number generators such as the Linear Congruential Generator can be used to alleviate these problems.

Thank you for for the read! If you found this post interesting or helpful, give it a share. If you have any input you’d like to give us, reach out to us at Poker-Engineering-Blog@zynga.com or shoot me an email at jarredwsimmerengineering@gmail.com.

References:

https://math.stackexchange.com/questions/671/when-you-randomly-shuffle-a-deck-of-cards-what-is-the-probability-that-it-is-a

https://www.youtube.com/watch?v=ObiqJzfyACM

https://www.mathsisfun.com/calculator-precision.html

https://labs.spotify.com/2014/02/28/how-to-shuffle-songs/

http://www.nytimes.com/2006/10/19/books/19masl.html

https://en.wikipedia.org/wiki/Randomness_tests

https://en.wikipedia.org/wiki/Linear_congruential_generator

https://www.fourmilab.ch/hotbits/

The Intractability of Certain Combinatorial Problems in Poker

When Counting Things in Poker is Too Hard

Written by Zynga Poker Engineering