Estimating Pi with Monte Carlo in Excel

Freshman geometry class finally put to use

Published in

The Startup

8 min readJul 1, 2019

Art created with Pi by color coding each digit. Originally created by Martin Krzywinski.

While I was watching the show Veep, Kent Davison, the pollster and numbers-cruncher for Selina’s campaign mentioned his favorite number was Euler’s number which made me wonder how mathematicians came up with these numbers that are used in everyday life. I started reading about the history of these important numbers and then stumbled upon the history of Pi. After reading about the long Wikipedia page, I believe Pi is one of the most fascinating number in mathematics; there are races on how many digits people can memorize of Pi, there are races to see how many digits of Pi can be computed, there is a holiday after Pi, and yet Pi’s accuracy is not really relevant to its everyday use. There are strong diminishing returns when adding additional digits to Pi, 3.14 is only .5% off from the true value of Pi. NASA scientists used Pi rounded to 16 digits for its calculations when developing the Space Integrated Global Positioning System/Inertial Navigation System (SIGI) . All of these things I read about made me think about how someone could estimate Pi without being a mathematician or with a crazy computer.

Introducing Monte Carlo

Monte Carlo is used to solve problems by simulating randomness. Monte Carlo simulations are often used for gambling to estimate probabilities. The navy will use Monte Carlo simulations to identify the most probable areas someone would be located during search and rescue missions. I have used Monte Carlo simulations to simulate S&P performance and would conduct random walks with an investment portfolio. To put it simply Monte Carlo is basically just simulated guessing and checking using randomness.

How to estimate Pi with Monte Carlo

Estimating Pi is a common example cited for Monte Carlo simulations. The best part is you only need to know some basic geometry to estimate Pi. The idea is that an area of a circle is:

So, if the radius is 1 inch, then the area of the circle is just π.

If the radius is 1 inch, then the diameter of the circle must be 2 inches. Okay, that’s great and all but how do you actually derive Pi if you only know the area of a circle is Pi? Well, since the diameter of the circle is 2 inches, we can also put the circle inside a 2-inch square. The circle will be touching the square’s sides but there will be a part of the square that does not overlap with the circle.

Here’s the underlying logic for deriving Pi.

Since the area of the square is 4 inches and the area of the circle is Pi, the ratio of the areas will be Pi / 4.

Assume we put a 2-inch square and a circle with a radius of 1 inch on a plot where the centers of the shapes are located at x = 0 and y = 0 so the coordinate is (0,0).

If we plot random points inside the square, count how many points are in the circle, and how many points are in the square, then the ratio should be Pi / 4. So, if we multiply the ratio by 4 then we will have Pi.

But, how will we know which points will be in the circle and which will be in the square?

Easy, the distance of each point from the origin is less than or equal to one. The points inside the circle can be found by using the Pythagorean theorem.

C squared is the longest side of the triangle and is also the radius of a circle (which is 1 inch).

So as long as the sum of x and y squared is less than or equal to 1, it must be inside the circle.

The hypotenuse is 1 inch long. Note how the sum of x and y squared is less than or equal to 1 inside the circle.

Monte Carlo Simulations in Excel

You don’t need fancy software to run Monte Carlo Simulations. Although nowadays I use a lot of R, I still believe Excel is the best “statistical” software due to its popularity and ease of use. You can really push Excel to do more than just pivot tables.

Creating the random variables and underlying logic

First create two columns and label them x and y

Now to simulate randomness use the rand function by typing =RANDBETWEEN(-1000000,1000000)/1000000 in both columns. This is to simulate random numbers between -1 and 1. The reason why I am using one million instead of one is to add additional decimals places to the number. Highlight the two functions and click the square located in the bottom right of the cells and drag down to the 1001 row. This will copy the formula 999 more times.

Go to the top of the worksheet to column C and click on cell C2. Type in =IF( (POWER(A2,2) + POWER(B2,2)) <=1,1,0) . Double click the bottom right corner to fill in the remaining rows. This is the underlying logic to find if the point is in the circle. The formula squares x and y and checks if it is less than or equal to 1. If the point is inside the circle than, it outputs a 1, if the point is outside of the circle, it outputs a 0.

Conducting the Monte Carlo Simulation

Type 1 and 2 in cells D2 and D3 respectively and highlight the cells. Double click on the bottom right of the highlighted cells to fill in the remainder of the cells. This creates the amount of iterations the Monte Carlo simulation will conduct. After that, click on cell C1 and type =4 *SUM(C2:C1001) / 1000 . Look familiar? This is the formula to estimate Pi! The sum is counting the points that fall inside of the circle and the denominator is the amount of total points which is in the square. This is the ratio that is needed to calculate Pi and all we needed to do is multiply it by 4 to get Pi.

But it probably doesn’t resemble Pi.

And this is where data tables will help improve the accuracy. We only plotted 1,000 points which is a relatively small sample. We need to plot more, but we don’t have enough rows to plot 1,000,000 points or run the simulation 1,000,000 times. Creating a data table will help with this. Click on cell E2 and type =C1. This is to reference the cell and will be added to the records in the following iterations. Highlight D2 and E2 and press shift+control+down arrow to select the entire table. Go to the Data tab and click on What-If Analysis and then click on Data Table.

Type in F2 in the column input cell. This is a trick that you don’t really need to know why. (Because I don’t know why you do it, you just do)

Click OK and you will see the table fill with estimations of Pi. Click on E2 and type =AVERAGE(E2:E1001). This averages all of the estimations of Pi to create an average estimation which should closely resemble the actual number of Pi! You have successfully created the ratio with 1,000,000 data points and derived and estimation of Pi. Play around with it and see how close you can reach Pi. Change the sample size (number of rows in columns A and B) or add more rows to the data table to increase the amount of iterations.

The more iterations or samples you add, the more accurate Monte Carlo will be.

If you want to do this in R and also recreate the graphs above, I have attached the R code.

library(tidyverse)
library(ggthemes)#Create an empty data frame
piSimulation <- data.frame("")#Adds 1,000,000 random points between -1 and 1
piSimulation <- cbind(piSimulation, x =runif(1000000,-1,1))
piSimulation <- cbind(piSimulation, y =runif(1000000,-1,1))#Removes the extra column
piSimulation <-  piSimulation[,-1]#Logic for checking if points are inside the circle
piSimulation <-  piSimulation %>% mutate(., inCircle = (piSimulation$x^2 + piSimulation$y^2) <=1)#Estimation of Pi
4 *sum(piSimulation$inCircle, na.rm = TRUE) / 1000000#Creates row reference to serve as a point count for graph
piSimulation$ID <- seq.int(nrow(piSimulation))#Graphs the points and filters the data to the first 10,000 points
ggplot(data = piSimulation %>% filter(., ID < 10000), mapping = aes( x, y, color = inCircle)) + geom_point() + theme_economist()

As you can see, creating it in R takes less than 10 lines of code.

Finishing thoughts

Monte Carlo is a fun tool to use and also has useful applications regardless of what field you are in. You may be asking why I didn’t just run the simulation for the first quadrant and used the rand() function in which x and y would always be positive. I think it is easier conceptually to understand however, plotting only in the first quadrant is easier to implement. There is something beautiful about the Monte Carlo method. I think it’s due to the ability of using randomness and chaos to find a solution that otherwise could not be found. Or maybe I just really want to live in Monaco.