Do open that door!
The Monty Hall problem is a well-known probability puzzle: there are three closed doors. Behind one door is a car (let’s say a Ferrari, sorry, I’m Italian), while behind the other two is a goat. You have to choose one, then one of the remaining two is opened and will show you a goat. At that point, you can change your mind, and switch to the other door, or stick with your decision. Should you switch?
Intuitively, one would say there is no difference in switching. You have two doors, one winner and the other loser, so 50–50. But it’s not. You have to change the door, and this will give you a 66,67% chance to win. On the other hand, sticking will win only 33,33% of the time.
I used a simple Monte Carlo simulation to verify it. Since it’s a random process, we need to run many iterations to get close to the theoretical values (even if, given the broad difference, a few thousand would be enough). But it was an excellent occasion to use Numba and Julia to speed up the process and compare the results. We will also learn some helpful tricks and discover some surprising behaviors.
Python
The Python code is straightforward. We initialize some variables to store the results and use random.choice to select the doors’ distribution. Then we will run our simulation in a for loop.
import numpy as np
from numba import njit, prange
@njit(cache=True)
def door_mc(n_iter=10_000):
# Initialize variables
d1_stat = np.empty((n_iter), dtype=np.int8)
d2_stat = np.empty((n_iter), dtype=np.int8)
d3_stat = np.empty((n_iter), dtype=np.int8)
switch = np.empty((n_iter), dtype=np.int8)
no_switch = np.empty((n_iter), dtype=np.int8)
choices = np.array([1, 0, 0]) # 1=Ferrari, 0=Goat
# Main loop
for i in prange(n_iter):
# Get doors
d1,d2,d3 = np.random.choice(choices,size=3, replace=False)
# Store results
d1_stat[i] = d1
d2_stat[i] = d2
d3_stat[i] = d3
# Get rid of one goat
if d2 == 0:
other = d3
else:
other = d2
# Store stragey outcome
no_switch[i] = d1
switch[i] = other
return d1_stat.mean(), d2_stat.mean(), d3_stat.mean(), switch.mean(), no_switch.mean()
On my PC, Intel i7 with 4 cores/8 threads and 32 GB of RAM, I ran 50 million iterations:
d1, d2, d3, s, ns = door_mc(50_000_000)
print(f"Percentage d1 wins = {d1:.2%}\nPercentage d2 wins = {d2:.2%}\nPercentage d3 wins = {d3:.2%}\nPercentage Switch wins = {s:.2%}\nPercentage No Switch wins = {ns:.2%}")
The output, given the significant number of simulations, is close to the theoretical values:
Percentage d1 wins = 33.33%
Percentage d2 wins = 33.33%
Percentage d3 wins = 33.34%
Percentage Switch wins = 66.67%
Percentage No Switch wins = 33.33%
We can use the magic command %time to obtain the computing time (I am running in JupyterLab):
%time d1, d2, d3, s, ns = door_mc(50_000_000)
CPU times: user 16.4 s, sys: 43.3 ms, total: 16.5 s
Wall time: 16.5 s
Numba took 16.4 seconds to complete the task. It’s interesting to see also the RAM usage. For that, I installed the IPython-memory-magic.
%memory -q d1, d2, d3, s, ns = door_mc(50_000_000)
RAM usage: line: 2.38 KiB / 238.42 MiB
Numba is highly efficient and uses just 238.42 Mega Bytes. Please note that I used int8 arrays to save memory (they will store 0 or 1).
Julia
Julia's code is really similar to Python:
using StatsBase
using Statistics
using Printf
using Base.Threads
function door_mc(n_iter::Integer)
# Initialize variables
d1_stat = Vector{Int8}(undef,n_iter)
d2_stat = Vector{Int8}(undef,n_iter)
d3_stat = Vector{Int8}(undef,n_iter)
switch = Vector{Int8}(undef,n_iter)
no_switch = Vector{Int8}(undef,n_iter)
choices = Array{Int8}([1, 0, 0]) # 1=Ferrari, 0=Goat
# Main loop
Threads.@threads for i in range(start=1, stop=n_iter, step=1)
# Get doors
d1,d2,d3 = sample(choices, 3, replace=false, ordered=false)
# Store results
d1_stat[i] = d1
d2_stat[i] = d2
d3_stat[i] = d3
# Get rid of one goat
if d2 == 0
other = d3
else
other = d2
end
# Store stragey outcome
no_switch[i] = d1
switch[i] = other
end
return mean(d1_stat), mean(d2_stat), mean(d3_stat), mean(switch), mean(no_switch)
end
The results are, as expected, the same:
d1, d2, d3, s, ns = door_mc(50_000_000);
@printf "Percentage d1 wins: %2.2f%%\nPercentage d2 wins: %2.2f%%\nPercentage d3 wins: %2.2f%%\n" d1*100 d2*100 d3*100
@printf "Percentage Switch wins: %2.2f%%\nPercentage No Switch wins: %2.2f%%\n" s*100 ns*100
Percentage d1 wins: 33.33%
Percentage d2 wins: 33.34%
Percentage d3 wins: 33.33%
Percentage Switch wins: 66.67%
Percentage No Switch wins: 33.33%
What about time and RAM usage?
@time d1, d2, d3, s, ns = door_mc(50_000_000);
2.492081 seconds (100.00 M allocations: 6.938 GiB, 38.10% gc time)
Julia is much faster (2.49 seconds vs. 16.4 using Numba; 6.58X), but the RAM usage is massive: 6.938 Giga Bytes vs. 238.42 Mega Bytes (29X). Please note, that in this case, Julia used only four threads. In fact, in Jupyter, you must start the Julia kernel, indicating how many threads you want to use.
Memory-optimized versions.
We may change our function, and avoid using arrays to store each result. This should save memory, but what about execution time?
@njit(cache=True)
def door_mc2(n_iter=10_000):
# Initialize variables
d1_stat = 0
d2_stat = 0
d3_stat = 0
switch = 0
no_switch = 0
choices = np.array([1, 0, 0]) # 1=Ferrari, 0=Goat
# Main loop
for i in prange(n_iter):
# Get doors
d1,d2,d3 = np.random.choice(choices,size=3, replace=False)
# Store results
d1_stat += d1
d2_stat += d2
d3_stat += d3
# Get rid of one goat
if d2 == 0:
other = d3
else:
other = d2
# Store stragey outcome
no_switch += d1
switch += other
return d1_stat/n_iter, d2_stat/n_iter, d3_stat/n_iter, switch/n_iter, no_switch/n_iter
In this case, for the same 50 million simulations, Numba took 16.7 seconds but used just 13.42 Kilo Bytes!
In Julia, instead, the code becomes:
function door_mc2(n_iter::Integer)
# Initialize variables
d1_stat::Int64 = 0
d2_stat::Int64 = 0
d3_stat::Int64 = 0
switch::Int64 = 0
no_switch::Int64 = 0
choices = Array{Int64}([1, 0, 0]) # 1=Ferrari, 0=Goat
# Main loop
for i in range(start=1, stop=n_iter, step=1)
# Get doors
d1,d2,d3 = sample(choices, 3, replace=false, ordered=false)
# Store results
d1_stat += d1
d2_stat += d2
d3_stat += d3
# Get rid of one goat
if d2 == 0
other = d3
else
other = d2
end
# Store stragey outcome
no_switch += d1
switch += other
end
return d1_stat/n_iter, d2_stat/n_iter, d3_stat/n_iter, switch/n_iter, no_switch/n_iter
end
It gives the same outcome in 5.98 seconds (in single thread mode), using the same amount of RAM (6.938 Giga Bytes). It’s worth noting that on my version (1.7.4), the multi-threaded version took 6.99 seconds, using a whopping 11.176 Giga Bytes of RAM, and gave unreliable results. I don’t know if it’s a bug or if I misused the Thread module. I will verify and open a ticket if needed. What seems clear, is that Julia works better with arrays.
In a future post, I will try Cython and Mojo.