Using Monte Carlo to Answer a Blackjack Question
A few nights ago, my brother sent me a Snapchat of a Blackjack strategy chart. He asked a pretty reasonable question:
The situation: we hold 7, 7 and the dealer shows a 10. Stand on a pair of 7s against a 10? My initial reaction was the same as his: how on earth is this correct?
Let’s weigh all the options we have here. We can split the 7s, hit, or stand. I don’t really see much of an argument for splitting. If we split, we have two separate 7s facing a 10, both obviously expecting to lose money. One of the first things a blackjack player will tell you is to guess that the dealer’s hidden card is a 10. So, in this case, we would be up against a 20. Twice. For this reason, we don’t want to put any more money on the table by splitting! Also, just from looking at the chart alone, no neighboring cell suggests splitting. Splitting is off the table.
The other two options are either stand on 14 (as the chart suggests) or “hit me dealer.” If we hit, it is still unclear how long we should keep hitting. If we draw an ace, do we hit on 15? What about 16?
All of these questions can be answered with a branch of mathematics called combinatorics. Combinatorics is the study of finite or countable discrete structures. It deals with evaluating all possible combinations of what could happen (what exists). As you can imagine, this gets quite complicated for blackjack. For example, let’s say we wanted to solve for the expected ROI for standing on 14. We would have to account for every possible distinct order of cards drawn from the shoe. This is not as challenging to calculate when we stand on 14 (7,7) since we only need to consider what can happen with the dealer’s holdings. To calculate the true value for standing on 14, we would take the mean result of all possible draws.
If we are dealing with a shoe of 6 decks, we have to consider the 309 (52*6 - 3) remaining cards and all possible orders in which they can be drawn. There are a lot of multiplicities, which makes it simpler, but even thinking about how to group the multiplicities is a doozy. Imagine how difficult the combinatorics gets when we decide to only stand on 17+.
My brother is a math major — I’m sure he is interested in seeing the dirty work. But, I want to give an easier, quicker solution that anyone can find! Cue the drumroll… let me introduce the Monte Carlo approach.
The Monte Carlo method gives a numerical approximation for a true value. The fundamental idea is if we randomly simulate an event many times, the resulting sample mean will approximate the true mean. In the context of our problem, this means that we can be confident that the true value for each decision is within some margin of error of the numerical approximation we generate from running a Monte Carlo simulation.
Simulating a Coin Flip
Let me give a simpler example of why it works. Let’s consider flipping a coin. Here is a function I wrote in Python to simulate a coin flip:
A heads is 1 and a tails is 0. The true mean value of our distribution is clearly .5 (50%*0 + 50%*1). Let’s say theoretically we didn’t know that we were supposed to get heads 50% of the time. Then, we could simulate the problem a large number of times, and be confident that the true mean is close to the sample mean we calculated from Monte Carlo.
The numerical approximation for the coin flip is ~.497, meaning we simulated landing on heads 49.7% of the time.
Monte Carlo is randomly sampling from a distribution many times. Typically, we don’t know what that distribution of results looks like (as in the blackjack problem), but the coin flip example is a good way to build an intuition for why this method is so useful.
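For readers who want to try this themselves, here is a minimal sketch of such a coin-flip simulation. It is not the function from the post: it uses Python's standard `random` module, and the names `coin_flip` and `monte_carlo_mean` are illustrative choices.

```python
import random

def coin_flip(rng):
    """Return 1 for heads or 0 for tails, each with probability 0.5."""
    return rng.randint(0, 1)

def monte_carlo_mean(num_flips, seed=None):
    """Estimate the mean of the coin-flip distribution by simulation."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    heads = sum(coin_flip(rng) for _ in range(num_flips))
    return heads / num_flips

# The sample mean should land close to the true mean of 0.5.
print(monte_carlo_mean(100_000, seed=42))
```

Rerunning with a different seed gives a slightly different estimate each time, which is exactly the sampling noise that Monte Carlo trades for simplicity.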
We can apply the same type of logic to the blackjack hand. I wrote a Python script that simulates the blackjack scenario described above, given a decision set. The first decision we can try is to stand on 14. Using Monte Carlo simulations, we can then approximate the ROI on this decision. Another decision is to stand on 15. We can do this all the way up to standing on 20, and see which strategy results in the best return on investment. Spoiler alert: they will all lose money. The question is which one loses the least. It is a little more challenging than simulating a coin flip, but still much easier than solving the problem combinatorially. I’ve attached a link to a GitHub repo I made to share this script.
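To make the idea concrete, here is a rough sketch of what such a simulation could look like. It is not the script from the repo, and it rests on several assumptions of its own: the dealer stands on every 17, the dealer has already checked for blackjack and does not have one (so the hole card is never an ace), a win or loss is worth exactly one unit, and the function names are hypothetical.

```python
import random

def hand_value(cards):
    """Best total for a hand; aces are stored as 1 and one may count as 11."""
    total = sum(cards)
    if 1 in cards and total + 10 <= 21:
        total += 10
    return total

def build_shoe(num_decks=6):
    """Card values for the shoe, minus our two 7s and the dealer's ten."""
    one_deck = ([1] + list(range(2, 10)) + [10] * 4) * 4  # 52 cards
    shoe = one_deck * num_decks
    for known in (7, 7, 10):
        shoe.remove(known)
    return shoe

def simulate_hand(stand_on, shoe, rng):
    """Play 7,7 against a ten once; return +1 (win), 0 (push), or -1 (loss)."""
    # Assumption: the dealer peeked and has no blackjack, so we redraw
    # until the first card (the hole card) is not an ace.
    while True:
        cards = rng.sample(shoe, 16)  # 16 cards is more than one hand can use
        if cards[0] != 1:
            break
    hole, draws = cards[0], iter(cards[1:])

    player = [7, 7]
    while hand_value(player) < stand_on:  # keep hitting below the threshold
        player.append(next(draws))
    if hand_value(player) > 21:
        return -1.0  # we bust before the dealer plays

    dealer = [10, hole]
    while hand_value(dealer) < 17:  # assumption: dealer stands on all 17s
        dealer.append(next(draws))

    p, d = hand_value(player), hand_value(dealer)
    if d > 21 or p > d:
        return 1.0
    return 0.0 if p == d else -1.0

def estimate_roi(stand_on, num_sims=100_000, num_decks=6, seed=0):
    """Monte Carlo estimate of average units won per unit bet."""
    rng = random.Random(seed)
    shoe = build_shoe(num_decks)
    total = sum(simulate_hand(stand_on, shoe, rng) for _ in range(num_sims))
    return total / num_sims

for stand_on in (14, 17):
    print(stand_on, round(estimate_roi(stand_on, num_sims=20_000), 3))
```

The numbers this prints depend on the rule assumptions, the seed, and the number of simulations, so they need not match the figures reported in the post.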
Here are the results:
In this example, I use 6 decks. This shows that if we simulate standing on 14 one million times, the average result is a loss of .535 units. So if we have $10 on the felt, we expect to lose $5.35.
If we change the decision point to 15, we see an improvement in performance. This time, we only lose .476 units on average.
Here, we see a very similar result. Our simulation cannot tell us conclusively whether it is better to stay on 15 or 16. This is because there is only a small probability that we ever hold 15 given our starting cards 7,7. We would have to draw exactly an ace to face a decision at 15. Still, it appears that 16 is a slightly better standing point.
Again, a very close result! Moving the decision point to 17 seems to have a slight edge over 15 and 16. But, standing on 14 is simply not correct with six decks.
For fun, I calculated the approximations up to standing on 20.
Important Note: Moving the decision point between 15 and 17 has little effect on the average units lost because these strategies rarely play out differently. The simulation is set up as two 7s vs a ten. When we hit on 14, we only have to worry about the 15 decision point if we draw an ace. We only worry about 16 if we draw a deuce, or two aces in a row. These things don’t happen very often. So although the approximations suggest that 17 is the best standing point, the differences in sample means between the decision points might be inconclusive.
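To put rough numbers on "not very often", the arithmetic can be sketched as below. It assumes the same 6-deck shoe with only our two 7s and the dealer's ten removed, and it ignores the dealer's unseen hole card; the variable names are illustrative.

```python
from fractions import Fraction

remaining = 52 * 6 - 3   # 309 unseen cards
aces = deuces = 4 * 6    # 24 of each rank, none of them seen yet

p_ace_first = Fraction(aces, remaining)
p_deuce_first = Fraction(deuces, remaining)
# Two aces in a row: the second ace comes from 23 aces among 308 cards.
p_two_aces = p_ace_first * Fraction(aces - 1, remaining - 1)

p_reach_15 = p_ace_first                 # 14 + ace
p_reach_16 = p_deuce_first + p_two_aces  # 14 + deuce, or 14 + ace + ace

print(f"P(decision at 15) = {float(p_reach_15):.1%}")
print(f"P(decision at 16) = {float(p_reach_16):.1%}")
```

Under these assumptions each decision point comes up in fewer than one hand in ten, so strategies that differ only at 15 or 16 play identically most of the time.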
I mentioned that Monte Carlo gives a numerical approximation for the true mean of a distribution. Intuitively, the more times we simulate something, the closer our sample mean is to the true mean [on average]. So, just how close are we to the true expected value for each decision point?
Well, we have no way of knowing for sure. Everything was randomly determined by a Python module called... numpy.random. It is possible that, by chance, the result obtained is not that close to the true mean. However, we can develop a confidence interval around the sample mean. This means we can be confident (to different degrees) that the true value lies within some interval (of different widths) around the numerical approximation obtained from Monte Carlo.
In the next post, I discuss the Central Limit Theorem, how we can apply it to build these confidence intervals, and also why the chart might have suggested standing on 7,7! Read on, brother.
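As a preview, here is one common way to build such an interval, sketched on the coin flip rather than the blackjack hand. It assumes the usual normal approximation (sample mean plus or minus about 1.96 standard errors for roughly 95% confidence), and the helper name `mean_confidence_interval` is hypothetical.

```python
import math
import random
import statistics

def mean_confidence_interval(samples, z=1.96):
    """Normal-approximation interval for the true mean (z=1.96 is ~95%)."""
    n = len(samples)
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    return mean - z * std_err, mean + z * std_err

rng = random.Random(1)
flips = [rng.randint(0, 1) for _ in range(10_000)]
low, high = mean_confidence_interval(flips)
print(f"approx. 95% interval for the mean: [{low:.3f}, {high:.3f}]")
```

Because the standard error shrinks like one over the square root of the number of simulations, quadrupling the number of flips roughly halves the width of the interval.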