Replicating the Monty Hall Problem Through Code

7 min readMar 21, 2017

If you’re a fan of probability (or old game shows), you may be familiar with the “Monty Hall” problem. This problem takes its name from the host of the original iteration of the game show “Let’s Make a Deal” in the 1960s and 70s.

The signature component of this show involved contestants being presented with three doors (Door #1, Door #2, and Door #3). The contestant is told that one of the doors contains a high-value prize (such as a new car), while the other two doors conceal “Zonk” booby-prizes such as small farm animals or junk furniture.

The contestant is asked to pick which of the three doors (e.g., Door #1) they believe conceals the high-value prize. After the contestant selects their initial door, one of the other two doors (e.g., Door #2) is opened to reveal a “Zonk.” The contestant is then asked whether they would like to keep their initial selection (Door #1), or switch their pick to the remaining unchosen door (Door #3) in an attempt to secure the high-value prize.

The contestant’s conundrum is this: do you keep your initial door, or do you switch? Does it affect your chances of winning?

Contrary to common intuition, the contestant has a sizably higher chance of winning the high-value prize if they switch their door. Why this occurs relates to the initial assumptions we have upon selecting our initial door, compared to the assumptions we have upon one of the Zonk doors being opened.

This video provides perhaps the clearest explanation I have seen to date of why your chances of winning are consistently higher if you switch doors compared to keeping your initial selection.

As you can see, on average, contestants have approximately a 1/3 chance of winning the high-value prize if they keep their initial selection, and approximately a 2/3 chance of winning if they switch.

This is all well and good from a theoretical standpoint, but is there a way that we can simulate the Monty Hall problem in a way that supports these probabilistic expectations?

Yes, there is! While playing a single game of the Monty Hall problem doesn’t guarantee that switching one’s door pick will result in a win condition, if we simulate the Monty Hall problem over many, many games, our distribution of results should approach the expected 1/3 chance of winning if “keeping” our initial door pick, and the 2/3 chance of winning if “switching” our pick.

Recreating the Monty Hall Problem in Code

I used Python in constructing my game, and wanted to remain as true to the physical conditions of the Monty Hall problem when assembling my approach.

I constructed a function that took three arguments:

the number of Monty Hall simulations to run
an initial door pick (I used doors A, B, and C, rather than 1, 2, and 3)
a determination to either keep or switch the initial door pick (K or S)

The code for my function looks as follows. Note, I’ll present the entire block of code first, and then break out chunks to explain how each piece works:

import numpy as npdef monty_hall_game(num_games, door_pick, keep_switch): 
    wins = 0   
    door_pick = door_pick.lower() 
    keep_switch = keep_switch.lower() 
    door_set = [“a”, “b”, “c”]
    for n in range(0, num_games):
        open_door_set = [“a”, “b”, “c”] 
        unchosen_door_set = [“a”, “b”, “c”] 
        unchosen_door_set.remove(door_pick) 
        win_door = np.random.choice(door_set, 1) 
        if door_pick == win_door: 
            open_door_set.remove(win_door)
        else:
            open_door_set.remove(win_door) 
            open_door_set.remove(door_pick)
        open_door = np.random.choice(open_door_set, 1) 
        unchosen_door_set.remove(open_door) 
        if keep_switch == “k”: 
            if door_pick == win_door: 
                wins += 1 
        if keep_switch == “s”: 
            if unchosen_door_set[0] == win_door: 
                wins += 1
 return float(wins)/float(num_games)

Okay, this might look like a lot at first, but let’s break it down into its components and explore how this works before running simulations.

import numpy as np

We import numpy because we will rely on some of its random choice functions as part of our function’s logic.

def monty_hall_game(num_games, door_pick, keep_switch):

This is our function. If we want to run 10,000 simulations of the game where our initial pick is Door A, and we choose to Keep our initial door for each simulation, we would run full code block above in python, and then type something like:

monty_hall_game(10000, "a", "k")

With that out of the way, the next several lines of code set up a number of items we’ll want track over the course of our simulations:

Setting Initial Conditions

    wins = 0   
    door_pick = door_pick.lower() 
    keep_switch = keep_switch.lower() 
    door_set = [“a”, “b”, “c”]

“Wins” serves to count the number of times the user’s arguments result in a win over the total number of simulations that are run. That is, how many times out of 10,000 games would a user win if they choose door A and always keep their initial pick? For obvious reasons, we start this counter at zero. Notice that the win counter exists separately outside of running each individual simulation, as we want to retain the total number of wins across the entire range of our simulations.

The “.lower()” function will ensure that the user’s door pick and keep/switch condition are converted to lowercase before simulations begin. This is more of a user-friendliness component of the function than anything else, so that the function will work whether the door input is “A” or “a”, etc.

Finally, we are defining our initial door set to include all available doors (A, B, and C)

Setting Up Each Game and Choosing the Win Door

    for n in range(0,num_games): 
        open_door_set = [“a”, “b”, “c”] 
        unchosen_door_set = [“a”, “b”, “c”] 
        unchosen_door_set.remove(door_pick) 
        win_door = np.random.choice(door_set, 1)

Now we can begin simulating the Monty Hall Problem. So, for every game between 0 and the number of simulations (10,000), we want to make sure certain conditions are reset at the beginning of each game.

For each game played, we initially set all doors (“A”, “B”, and “C”) capable of being opened (the open_door_set). We also want to create a list of all doors that might be the unchosen door — that is, the door that is neither picked by the user, nor opened to reveal the Zonk.

At this point in the game, we can safely remove the user’s door pick from the unchosen door set because, by definition, the user’s door pick cannot be the unchosen door. Thus, “unchosen_door_set.remove(door_pick)”

Finally, we allow our code to randomly select one of the three doors in our door set as the “win_door”.

Determining the “Open” Door and the “Unchosen” Door

        if door_pick == win_door: 
            open_door_set.remove(win_door)
        else:
            open_door_set.remove(win_door) 
            open_door_set.remove(door_pick)
        open_door = np.random.choice(open_door_set, 1) 
        unchosen_door_set.remove(open_door)

The next section of code is dedicated to selecting both the door to open and the remaining unchosen door. How this works depends on whether the user happened to pick the “win” door in any given simulation.

If the user picked the win door, then either of the other two doors could be the door that is opened. So, if the user picked the win door, only the user’s door pick is eliminated from the open door set (leaving Door B and Door C in the open door set).

If the user did not pick the win door, the only one of the two remaining doors can be opened (because the “host” cannot open the user’s pick or the win door). If this is the case, then both the user’s pick and the win door are eliminated from the open door set. If the user picked Door A and the win door is Door C, then only Door B remains the open door set.

For however many doors remain in the open door set (either two or one), one door is randomly selected from any remaining as the door to open.

The “open” door is then removed from the unchosen door set — leaving one “unchosen” door.

At this point, the game is fully set up — we have 1.) the user’s door pick, 2.) a Zonk door to open, and 3.) one remaining unchosen door for purposes of either keeping or switching.

To Keep or To Switch?

        if keep_switch == “k”: 
            if door_pick == win_door: 
                wins += 1 
        if keep_switch == “s”: 
            if unchosen_door_set[0] == win_door: 
                wins += 1
 return float(wins)/float(num_games)

The final step now hinges on whether the user decided to keep or switch their initial pick.

If the user chose to keep their initial pick, and their pick was the win door, then they win! The wins counter at the top of the function counts up by one.

If the user chose to switch their initial pick, their new pick becomes the unchosen door. If the unchosen door was the win door, then the user wins! The wins counter at the top of the function counts up by one.

At the end of all simulations, the function will return the total number of wins divided by the total number of games — i.e., the user’s total win percentage.

Examining the Results over 10,000 Simulations

I ran 10,000 simulations of the Monty Hall game keeping my initial door pick. I then ran 10,000 more simulations of the Monty Hall game switching my initial door.

monty_hall_game(10000, "a", "k")

When keeping my initial door each time, my win percentage was 33.23%

monty_hall_game(10000,”a”,”s”)

When switching my initial door each time, my win percentage was 66.77%

Just as we would have expected!

I particularly liked the way I coded this because, if you want to, you could program the function to return a list of the results of each game — showing the door pick, the win door, the open door, and the unchosen door, as well as the whether the user won. This way, you could manually go over the results of each of the simulations to verify the correct win percentages.

Pretty neat stuff!

While the Monty Hall problem may be a common example used in general probability courses, it was really interesting to generate a function that could demonstrate computationally why switching doors was a more lucrative option than keeping one’s initial pick.

I hope you found this interesting and informative!