If you are a backgammon lover, or you have been to a casino you probably know how unpredictable dice rolling is. Because of its nature, most games played with dice only involve 6 sided ones. If I ask you how many points on average you expect to have if you rolled such die 6 times? If you have taken some statistics classes in high school or college, your answer will be 3 and it will be “on average” correct. Let’s complicate the problem a bit.
Suppose, you are given two random 8 sided dice. They are hand-made, so sides are not equal. Given that you know the probabilities associated with each side of each die, which one would you choose to score 100 points the quickest?
This one is a bit tough. Don’t rush to get your pen and paper to solve it mathematically. Because we can also do it programmatically. Let’s jump into it.
We are going to use Python as our primary language and NumPy as our simulation tool. NumPy is a Python library that is the barebone of most scientific packages for Python. Almost all Machine learning, AI-related libraries use NumPy either fully or partly as its core. Because it is written primarily in C, it is swift compared to native Python. Let’s jump into the programming part right away!
First, we need to import tools that we are going to use. They are:
- NumPy for simulation
- Seaborn for visualization
- Matplotlib for some control on visualizations
If you have Anaconda distribution of Python installed, you can import those right away! If not, you can install those libraries via
pip install numpy seaborn matplotlib
And import them:
Now, it is time to generate the probabilities (recall that they are purely random). They must sum up to 1 and in order to generate such numbers, we will use
The first argument in that function is
alpha which in this case consists of a vector of ones with a length of 8. The second argument
size is used to determine how many of those probabilities should be generated. We have two dice, so
size = 2 . We can use the tuple unpacking feature of Python to assign every output to a variable. Then, we also need a
sides variable to store our initial scores. It will consist of numbers from 1 to 8 and we use the
np.arange function for that. Note that, when used with two arguments the second argument is not included in the result and it will be a vector of numbers from 1 to 8.
Now comes the core part, namely, the simulation part. First, we will generate 1k games in which both dice are rolled 100 times. The reason behind 1k games is the law of large numbers. According to this law, a large number of repetitions of the experiment will yield the results close to the expected outcome. We roll every die 100 times because the least score we can get is 1. So, rolling it 100 times will result in a guaranteed “finished” game.
np.random.choice will simulate random samples from a given set with given probabilities,
size argument will give us an array with 1k rows (games) and 100 columns (rolls).
After “playing” 1k games we need to calculate the scores for each game.
cumsum function is coming to our help. By default, it will flatten the entire array and just sum everything up. Yet, we need to do it column-wise (from left to right). So,
axis =1 .
Recall that, 100 rolls is the worst scenario to get 100 points and most likely we will reach that score way before that.
np.argmax will give the index of an element which satisfies any given condition, in our case it is the result being greater than or equal to 100. Again, we need to calculate it column-wise. Thus again,
axis = 1 .
Now we can calculate how many times each die wins the “game” and how many ties there are. What we need is simply to compare the results above with each other and then sum them up to get a number. NumPy arrays when being compared will output
True, False values only and because of the boolean logic in Python, we can sum them.
True values will be treated as 1 and
False as 0.
Let’s also visualize our results. Seaborn comes to the play now:
ax2 are two plots that we are going to have on top of each other. In order to distinguish them, one will be red and the other one green.
Die 1 won 112 times
Die 2 won 817 times
Tie games 71
Your results will be different than mine because of randomness. If you want to get the same results as I, after importing libraries, add this line to your code which fixes the randomness.
As you see, we are better off if we pick up the second die. If you look carefully, you will notice some overlap in two plots, which means there is still chance that you may be just unlucky. Because as the French say, c’est la vie. Everything is possible.