Using the Monte Carlo Method to Better Estimate Outcome

Orin Conn
Published in Analytics Vidhya · Jan 4, 2021 · 5 min read
Photo by Carl Ibale on Unsplash

When you face a decision where the range of possible outcomes carries a substantial risk that cannot be avoided, there is only so much you can do to account for that uncertainty analytically. Predicting the weather is one example. When there are too many variables to experiment with in practice, one way to tackle the uncertainty is the Monte Carlo method.

The Monte Carlo method was formally named by scientists working on the Manhattan Project (the atomic bomb) in the 1940s, whose models of nuclear physics were far too complex, and had too many dimensions, for standard analysis. Imagine creating one of the most dangerous weapons known to man: there is no room for miscalculation or error. They devised a way to approach the uncertainty probabilistically, using random sampling to simulate outcomes. The idea comes from inferential statistics: a sufficiently large random sample tends to exhibit the properties of the population it was drawn from. It is like a series of coin flips used to determine the probability of heads or tails, except that the series of probable events here dealt with nuclear fission that could kill hundreds of thousands of people. How pleasant. The name refers to the Monte Carlo casino in Monaco, reportedly because an uncle of Stanislaw Ulam, one of the scientists behind the method, was fond of gambling there.

An important note here is that building such a model requires information that is relevant to the project at hand. A published paper by Robert L. Harrison lists the key questions to settle first:

- what are our desired outputs?

- what will these outputs be used for?

- how accurate/precise must the outputs be?

- how exactly can/must we model?

- how exactly can/must we define the inputs?

- how do we model the underlying processes?

For any variable that can take a range of possible values, one defines the space of possible input combinations and randomly samples from it, running the model once per sample. Over many iterations, the aggregated results get closer to the real probability. This is essentially the law of large numbers: the more times you run the simulation, the more closely the results home in on the real probability of occurrence.
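As a minimal sketch of that idea, take the coin-flip case: the code below estimates the probability of heads from larger and larger samples. The function name `estimate_heads_probability`, the fixed seed, and the sample sizes are illustrative assumptions, not something taken from a particular source.

```python
import random


def estimate_heads_probability(num_flips, rng):
    """Estimate P(heads) for a fair coin by simulating num_flips flips."""
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed (an assumption) so runs are repeatable
    for num_flips in (10, 1_000, 100_000):
        estimate = estimate_heads_probability(num_flips, rng)
        print(f"{num_flips:>7} flips: estimated P(heads) = {estimate:.4f}")
```

With only a handful of flips the estimate can land far from 0.5; with many flips it should settle close to it.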

An easy-to-follow example is a simulation of roulette, a common casino game and one that my college roommates and I were once naively convinced we could crack (the infamous Gambler’s Fallacy). Let’s work through the code for an object-oriented roulette game from an MIT course in data science led by Professor John Guttag. (Why spend extra time rewriting code when great code is already written? The only rule is that you should know exactly what is going on.)

The class below defines a fair roulette game, meaning the expected return is 0%. Let’s run the game up to 1,000,000 times and use the simulated spins to estimate the average return, in the spirit of the Monte Carlo method.

import random


class roulette():
    """A fair roulette wheel with pockets numbered 1 through 36."""

    def __init__(self):
        self.pockets = []
        for i in range(1, 37):
            self.pockets.append(i)
        self.ball = None
        # A winning bet pays 35 to 1, which is what makes this game fair.
        self.pocketOdds = len(self.pockets) - 1

    def spin(self):
        # Land the ball in a uniformly random pocket.
        self.ball = random.choice(self.pockets)

    def betPocket(self, pocket, amount):
        # Return the winnings (positive) or the loss (negative) for one bet.
        if str(pocket) == str(self.ball):
            return amount*self.pocketOdds
        else:
            return -amount

    def __str__(self):
        return 'Roulette'


def play_roulette(game, numSpins, pocket, bet):
    # Bet `bet` on `pocket` for numSpins spins and report the average return.
    totPocket = 0
    for i in range(numSpins):
        game.spin()
        totPocket += game.betPocket(pocket, bet)
    print(numSpins, 'spins of', game)
    print('Expected return betting', pocket, '=', str(100*totPocket/numSpins) + '%\n')
    return (totPocket/numSpins)


game = roulette()
# Three independent trials at each number of spins.
for numSpins in (100, 1000, 10000, 100000, 1000000):
    for i in range(3):
        play_roulette(game, numSpins, 5, 1)

Here the largest run draws 1,000,000 spins, but what we really want is to compare the returns from 3 trials each of 100 spins, 1,000 spins, 10,000 spins, 100,000 spins, and 1,000,000 spins.

Results:

100 spins of Roulette
Expected return betting 5 = -64.0%

100 spins of Roulette
Expected return betting 5 = 44.0%

100 spins of Roulette
Expected return betting 5 = 8.0%

1000 spins of Roulette
Expected return betting 5 = 4.4%

1000 spins of Roulette
Expected return betting 5 = 11.6%

1000 spins of Roulette
Expected return betting 5 = -2.8%

10000 spins of Roulette
Expected return betting 5 = -6.4%

10000 spins of Roulette
Expected return betting 5 = 3.68%

10000 spins of Roulette
Expected return betting 5 = -5.68%

100000 spins of Roulette
Expected return betting 5 = 1.484%

100000 spins of Roulette
Expected return betting 5 = -0.712%

100000 spins of Roulette
Expected return betting 5 = -1.864%

1000000 spins of Roulette
Expected return betting 5 = 0.8684%

1000000 spins of Roulette
Expected return betting 5 = -0.2116%

1000000 spins of Roulette
Expected return betting 5 = 0.1448%

Our average return after 3 iterations of 100 spins was -4.0%.

Our average return after 3 iterations of 1,000 spins was 4.4%.

Our average return after 3 iterations of 10,000 spins was -2.8%.

Our average return after 3 iterations of 100,000 spins was -0.364%.

Our average return after 3 iterations of 1,000,000 spins was 0.267%.

If we had not run so many games, our perception of the average return would be different. With 3 iterations of just 100 spins, one could assume the return was -4%. If we stopped at 10,000 spins, the three averages so far (-4%, 4.4%, and -2.8%) would give an estimated return of -0.8%.
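The per-size averages can be recomputed directly from the printed results. This is just a small sketch of the arithmetic; the dictionary name `returns_by_spins` is my own label, and the numbers are copied from the output listed under Results.

```python
from statistics import mean

# Percentage returns copied from the printed results,
# keyed by the number of spins in each trial.
returns_by_spins = {
    100: [-64.0, 44.0, 8.0],
    1_000: [4.4, 11.6, -2.8],
    10_000: [-6.4, 3.68, -5.68],
    100_000: [1.484, -0.712, -1.864],
    1_000_000: [0.8684, -0.2116, 0.1448],
}

# Average the three trials at each number of spins.
average_by_spins = {n: mean(vals) for n, vals in returns_by_spins.items()}

for n, avg in average_by_spins.items():
    print(f"{n:>9,} spins: average return over 3 trials = {avg:.3f}%")
```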

We can see that as the number of spins grows, the estimated average return narrows toward 0%, with less and less spread between trials. We can then infer that the true expected return is in fact 0%, and that the more experiments we run, the closer our estimate gets to it.
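For a game this simple we can also check the simulation against theory. The sketch below works out the exact expected return and variance of a single 1-unit bet on one pocket of the 36-pocket wheel, then the typical spread of the average return for each number of spins. The constant and function names (`NUM_POCKETS`, `PAYOUT`, `standard_error_percent`) are my own, and the spread formula assumes the spins are independent.

```python
import math
from fractions import Fraction

NUM_POCKETS = 36          # pockets 1..36, as in the roulette class
PAYOUT = NUM_POCKETS - 1  # a winning 1-unit bet returns 35 units

p_win = Fraction(1, NUM_POCKETS)
p_lose = 1 - p_win

# Expected return and variance of a single 1-unit bet.
expected_return = p_win * PAYOUT + p_lose * (-1)
variance = p_win * PAYOUT**2 + p_lose * (-1) ** 2 - expected_return**2
std_dev = math.sqrt(variance)


def standard_error_percent(num_spins):
    """Standard deviation of the average return over num_spins, in percent."""
    return 100 * std_dev / math.sqrt(num_spins)


print("expected return per bet:", expected_return)
for num_spins in (100, 1_000, 10_000, 100_000, 1_000_000):
    spread = standard_error_percent(num_spins)
    print(f"{num_spins:>9,} spins: typical deviation of about {spread:.2f}%")
```

The expected return comes out to exactly 0, and the typical spread shrinks from roughly 59% at 100 spins to about 0.6% at 1,000,000 spins, which is in line with the scatter in the results.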

This is the idea behind Monte Carlo simulation! Our example was extremely simple. However, when one does not know the probability of a certain outcome, the model is run thousands of times over randomly sampled inputs and the results are averaged. I hope you were able to grasp the concept of the Monte Carlo method and can see how it helps in estimating unknown risk.

If you have any suggestions or questions regarding this method, reach out! I am always looking to learn and improve.

Orin

I’m a recent Data Science graduate with a B.S. in Environmental Science. Currently seeking job opportunities. Constantly learning!