The Monte Carlo Simulation using Python!

Pyariksha Tiluk
4 min read · Jan 5, 2020


The Monte Carlo Simulation is something that my master’s class was recently tasked to explore as an introduction to data analysis. I must admit, my understanding in the beginning was crude to say the least… BUT… I believe I’ve wrapped my head around the mechanics of running a couple thousand simulations using our favourite reptilian language… No, not Parseltongue 🐍…Python! #Gryffindorforever

Voldemort: “ssshshskkaaasshhh sshkashh sssh” (“Your jokes are weak, just like you”, translated from Parseltongue)

The Investor scenario:

Firstly, the Monte Carlo Simulation needs to be applied to a scenario which contains a dependent variable and multiple parameter variables. The simple case used in our class was that of a $100,000 investment left to grow over a 30-year period, with an additional $10,000 payment made in advance each year and a variable interest rate drawn at random from a normal distribution. The mean and standard deviation of the interest rate were also provided.

Additionally, the dependent outcome or variable would be the future value that the investor would receive. Given the law of large numbers (the larger the sample size, the closer the sample mean tends to be to the population mean), the more simulations we run, the closer the average of the simulated future values will be to the expected future value at the end of 30 years, under the assumptions of the model.

In layman's terms: the average future value from 10,000 simulations is a more reliable estimate than the average from, let’s say, 10 simulations.
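To make the law of large numbers a bit more tangible, here is a minimal sketch. It repeats the "take the average of n random rates" experiment many times and measures how much that average wobbles between repeats. The mean and standard deviation (`MEAN_RATE`, `STD_RATE`) are made-up placeholder values, not the ones from my class, and the helper name is my own.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical parameters, chosen only for illustration
MEAN_RATE = 0.07
STD_RATE = 0.15


def spread_of_sample_means(sample_size, repeats=200):
    """Repeat the experiment `repeats` times and return how much the
    sample mean varies (its standard deviation) across those repeats."""
    draws = rng.normal(MEAN_RATE, STD_RATE, size=(repeats, sample_size))
    return draws.mean(axis=1).std()


spread_10 = spread_of_sample_means(10)
spread_10000 = spread_of_sample_means(10_000)

print(f"spread of the mean with 10 draws:     {spread_10:.4f}")
print(f"spread of the mean with 10,000 draws: {spread_10000:.4f}")
```

The second number should come out much smaller than the first: with more draws, the average settles down around the true mean.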

Coding skills are cool but only if you know what to do with them!

I am a novice but from what I have seen thus far, the thinking behind the code is always more important than the actual writing of the code itself! This sounds simplistically logical but we can fall prey to thinking that complexity and the accumulation of skills equate to knowledge.

That’s like stockpiling your home shed with various tools and materials (hacksaws, drills, bolts, nails, wood, sheet metal, jackhammers etc. #zerodiyknowledge) only to realize that you don’t know why you need them or how to use them. So you end up going to IKEA and buying a “plug-and-play” coffee table, which is akin to resorting to ready-made software that gives you a pre-defined process for getting to an outcome… There is, of course, nothing wrong with that but if you’re trying to learn how to think then you need to actually…think… Sounds peculiar I know.

I digress.

I have included a simple diagram showing how I thought about the base calculations and how to structure the code to get the ending values at the end of 30 years:

My thought process for base calculations before running simulations

My thought process is of course neither correct nor incorrect, it is simply personal. Thus, your approach to this problem may be vastly different or similar! Don’t feel restricted — understand the “why” first and then plot your course with your own personal preferences!

The actual code for the programme:

First I started with the variables and imported numpy which will do all the heavy lifting.

As shown above, a rand_return array of 30 possible interest rates was created given the mean and standard deviation. This is the first structure to be created as shown in my thought-process diagram (Thinking process Part 2).
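For readers who cannot see the original screenshot, a rough sketch of this first step could look like the following. Only `annual_inv` and `rand_return` are names from my write-up; `initial_inv`, `years`, `mean_rate` and `std_rate` are hypothetical names, and the 7% mean and 15% standard deviation are placeholder numbers rather than the figures we were given.

```python
import numpy as np

initial_inv = 100_000   # starting investment (from the scenario)
annual_inv = 10_000     # additional payment made each year (from the scenario)
years = 30              # investment horizon (from the scenario)

# Hypothetical values: the class figures are not reproduced here
mean_rate = 0.07
std_rate = 0.15

# One random interest rate per year, drawn from a normal distribution
rand_return = np.random.normal(mean_rate, std_rate, years)

print(rand_return.shape)  # (30,)
```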

The second structure is the list of ending values that will contain the first year ending value at return_y[0]. All subsequent values are to be added per iteration of the remaining 29 returns in the rand_return array (in the diagram I called the list EV_list but in the code it is return_y — apologies for the inconsistency).
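As a sketch, seeding the list with the first-year ending value might look like this. I am assuming one particular reading of "in advance" here: the initial investment grows for the year and the next year's $10,000 is then added on top. If you treat the timing differently, this line is the one to change. The mean and standard deviation are again placeholder numbers.

```python
import numpy as np

initial_inv = 100_000
annual_inv = 10_000

# Hypothetical mean (7%) and standard deviation (15%)
rand_return = np.random.normal(0.07, 0.15, 30)

# Year 1: grow the initial investment by the first random rate,
# then add the payment made in advance for the following year (assumed timing)
return_y = [initial_inv * (1 + rand_return[0]) + annual_inv]

print(len(return_y))  # 1
```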

Ok! So now we add the remaining 29 ending values for the remaining 29 years of investment to return_y. The if statement accounts for the $10,000 annual_inv being paid in advance: it is added to every ending value except the final one at return_y[29], because there is no following year to pay into.

It is important to note that rand_return_30 (the array of 30 random returns) is generated inside the simulation for loop, so that when we run 1000 simulations next, a new array of 30 returns is created every time. Otherwise we would get the exact same value at the end of year 30, 1000 times.
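Here is a minimal sketch of that 30-year loop, wrapped in a function so it can be checked on its own. The function name `ending_values` and the loop variable names are my own for this illustration, and the payment timing is the same assumed reading as before (grow first, then add next year's payment, except in the final year).

```python
import numpy as np


def ending_values(rand_return, initial_inv=100_000, annual_inv=10_000):
    """Return the list of year-end values for one path of yearly returns.
    Assumes at least two years of returns."""
    last_year = len(rand_return) - 1

    # Year 1 ending value, including the payment made in advance for year 2
    return_y = [initial_inv * (1 + rand_return[0]) + annual_inv]

    # Remaining years: grow last year's ending value by this year's rate
    for year in range(1, len(rand_return)):
        ending_value = return_y[year - 1] * (1 + rand_return[year])
        if year < last_year:
            # Payment in advance for the next year; skipped in the final year
            ending_value += annual_inv
        return_y.append(ending_value)

    return return_y


# Sanity check with zero returns: 100,000 plus 29 payments of 10,000
flat_path = ending_values(np.zeros(30))
print(flat_path[29])  # 390000.0
```

The zero-return check is a handy way to confirm the if statement does what we want: only 29 of the 30 years receive a payment.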

In the above for loop for 1000 simulations, I embedded the previous for loop for the 30 year return period. The outer for loop will run this 1000 times and populate a new simulation list called “output” with the ending value of each full run at year 30 which is the value at return_y[29] #line11.

The last step at #line12 is to calculate the mean/average value of all 1000 simulations of the final return at the end of year 30. This gives us an estimate of the expected future value given the variability in returns, and the estimate becomes more stable as the number of simulations increases.
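Putting the pieces together, the whole programme can be sketched roughly as below. This is a reconstruction under my stated assumptions and not a copy of the original screenshot, so the #line references will not match it. `output`, `return_y`, `rand_return` and `annual_inv` follow the names used in this post; the other names, the seed, and the 7% / 15% parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the run is repeatable

initial_inv, annual_inv, years = 100_000, 10_000, 30
mean_rate, std_rate = 0.07, 0.15     # hypothetical placeholder values
n_sims = 1000

output = []
for _ in range(n_sims):
    # A fresh array of 30 returns per simulation; drawing it once outside
    # this loop would give the same ending value 1000 times
    rand_return = rng.normal(mean_rate, std_rate, years)

    return_y = [initial_inv * (1 + rand_return[0]) + annual_inv]
    for year in range(1, years):
        ending_value = return_y[year - 1] * (1 + rand_return[year])
        if year < years - 1:
            ending_value += annual_inv  # no payment after the final year
        return_y.append(ending_value)

    output.append(return_y[29])  # ending value at year 30

mean_fv = np.mean(output)
print(f"mean future value over {n_sims} simulations: {mean_fv:,.0f}")
```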

Final thoughts…

So far in my journey of learning how to become more technical and how to think about data/programming, I have realized that I only learn when I am faced with problems that don’t come with a set of instructions. It can be frustrating when you don’t know which way is up but once you learn how to structure your thinking and break apart a problem into simpler tasks it becomes easier and you will never forget it. I encourage you to make mistakes, fail and feel silly 😊… I do it all the time! Happy coding!
