Improving Monte Carlo using less “Randomness”

Helio Inaba
Technological Singularity
Apr 14, 2024

The Monte Carlo Method is a computational technique often used to solve complex problems involving variables with embedded uncertainty. Rather than eliminating that uncertainty, it embraces it, sampling the problem repeatedly under different scenarios.

The method consists of evaluating a function multiple times, sampling over its domain, and finding its Expected Value. This is particularly important due to the Feynman-Kac Formula, which links parabolic partial differential equations, a class of problems common in physics and finance and notoriously hard to solve analytically, to an Expected Value problem, which can be solved by Monte Carlo.

Sampling over a domain composed of random variables must take into account the probability distribution of the underlying variables. This can get very complex pretty fast once we add multiple variables and/or conditions to the computation, often ruling out an analytical solution. That is why we use randomly generated numbers to simulate a set of inputs that follows the given distributions and conditions.

Curiously, despite using randomness at its core, the method is not aiming at a random answer. This may sound counterintuitive, but being a numerical approximation, the output is expected to get closer and closer to a very specific value, one that depends on some sort of randomness but is nevertheless unique (e.g., Black-Scholes pricing).

The process of generating random numbers is not trivial. Any random number, or sequence of numbers, generated by a computer is not, in fact, random: since it comes from an algorithm, the same initial input will deterministically produce the same output. Because of that, we call them pseudo-random numbers. They appear sufficiently random to us and can be used in many applications (e.g., drawing the winning raffle ticket).

There are many algorithms classified as Pseudorandom Number Generators (PRNG). Although all of them produce sequences that look unpredictable to a casual observer, some are better suited for Monte Carlo methods than others.

In this article, I will present three generators and compare their performances over a problem with a well-known solution.

The problem in question is approximating pi with random numbers. The idea is to generate N random pairs of numbers between 0 and 1 and plot them on the X and Y axes. Then, we count how many fall inside a circle with a radius of 0.5 centered at (0.5, 0.5).

The proportion of points satisfying this condition to the total number of points should approximate the ratio of the area of the circle to the area of the square, which is π·0.5² / 1² = π/4. Multiplying that proportion by 4 therefore gives an estimate of π.
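A minimal sketch of that arithmetic (estimate_pi is a helper name of my own; the full sampling code appears in the sections below):

def estimate_pi(n_inside, n_total):
    # circle area / square area = (pi * 0.5**2) / 1**2 = pi / 4,
    # so the fraction of points inside, times 4, approximates pi
    return 4 * n_inside / n_total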

1. Regular numpy.random

These points are generated with numpy's legacy random module, which is backed by the Mersenne Twister. Despite being random, we can clearly see some clusters and some empty spaces that show up by pure chance. This happens because the process of generating random points is memoryless; in other words, each generated point is totally independent of the previous points. That may sound counterintuitive, but clusters and empty areas are expected, in the same way that you would expect to eventually get a sequence of ten heads or ten tails if you flipped a fair coin a million times.
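A small aside: the snippets below use the legacy np.random functions for brevity. In current NumPy the recommended entry point is np.random.default_rng(), which returns a seedable Generator (backed by PCG64 by default); a sketch of the equivalent, reproducible sampling would be:

import numpy as np

rng = np.random.default_rng(seed=42)        # seed chosen arbitrarily, for reproducibility
X, Y = rng.uniform(0, 1, size=(2, 2**14))   # same shape as the legacy call used below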

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from itertools import product, combinations
from scipy.stats import qmc

def is_inside(points, radius, center):
    # Returns a boolean list: True where the point (x, y) lies inside the circle
    X, Y = points
    x_center, y_center = center
    return [np.sqrt((x - x_center)**2 + (y - y_center)**2) < radius for x, y in zip(X, Y)]

N = 2**14

# Sample N uniform points in the unit square
X, Y = np.random.uniform(0, 1, (2, N))
center = (.5, .5)
radius = .5

circle = plt.Circle(xy=center, radius=radius, fill=False)

fig, ax = plt.subplots(figsize=(10, 10))

ax.add_patch(circle)
ax.scatter(x=X,
           y=Y,
           c=is_inside((X, Y), radius, center),
           cmap='bwr',
           s=.3)

ax.set_xlim(0, 1.0)
ax.set_ylim(0, 1.0)

plt.show()

2. Halton Sequences

Halton Sequences are not random at all. The method assigns a co-prime base to each axis (typically 2 for X and 3 for Y) and repeatedly subdivides each axis into that many parts, filling every subdivision before moving on to a finer one. It is important for the bases to be co-prime, so the periods of their subdivisions never line up, generating the illusion of randomness, as we can see below. Since both axes are always subdivided systematically, the points are more evenly distributed over the domain (the square).
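For intuition, each coordinate of a Halton point is a van der Corput sequence: write the point's index in the chosen base and mirror its digits around the decimal point. A minimal sketch of this construction (scipy's qmc.Halton, used below, adds scrambling on top of it):

def radical_inverse(i, base):
    # Mirror the digits of i (written in the given base) around the decimal point,
    # e.g. i = 6 = 110 in base 2 becomes 0.011 in base 2 = 0.375
    result, f = 0.0, 1.0 / base
    while i > 0:
        result += f * (i % base)
        i //= base
        f /= base
    return result

# The first few 2-D Halton points, using the co-prime bases 2 and 3
halton_points = [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(1, 9)]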


def halton(n, d=1):
    # Scrambled Halton sampler from scipy's quasi-Monte Carlo module
    sampler = qmc.Halton(d, scramble=True)
    return sampler.random(n)

X_halton, Y_halton = halton(n=N, d=2).T

circle = plt.Circle(xy=center, radius=radius, fill=False)

fig, ax = plt.subplots(figsize=(10, 10))

ax.add_patch(circle)
ax.scatter(x=X_halton,
           y=Y_halton,
           c=is_inside((X_halton, Y_halton), radius, center),
           cmap='bwr',
           s=.3)

ax.set_xlim(0, 1.0)
ax.set_ylim(0, 1.0)

plt.show()

3. Sobol Sequences

Sobol Sequences also use the idea of covering the domain more evenly, dividing each dimension into a power-of-2 number of intervals. Thus, they also create low-discrepancy sequences, as we can see below.
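One practical consequence is that Sobol points are best balanced when the total sample count is itself a power of two, which is why scipy exposes random_base2(m) to draw exactly 2**m points. A short sketch (the seed is an arbitrary choice of mine, for reproducibility):

from scipy.stats import qmc

sampler = qmc.Sobol(d=2, scramble=True, seed=0)
sample = sampler.random_base2(m=4)   # exactly 2**4 = 16 points in the unit square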

def sobol(m, d=1):
    # Scrambled Sobol sampler; random_base2(m) draws exactly 2**m points
    sampler = qmc.Sobol(d, scramble=True)
    return sampler.random_base2(m)

X_sobol, Y_sobol = sobol(m=int(np.log2(N)), d=2).T

circle = plt.Circle(center, radius, fill=False)

fig, ax = plt.subplots(figsize=(10, 10))

ax.add_patch(circle)

ax.scatter(x=X_sobol,
           y=Y_sobol,
           c=is_inside((X_sobol, Y_sobol), radius, center),
           cmap='bwr',
           s=.3)

ax.set_xlim(0, 1.0)
ax.set_ylim(0, 1.0)

plt.show()
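Before comparing convergence, we can also quantify how evenly each point set covers the square. A quick sketch using scipy's qmc.discrepancy, which measures the deviation from a perfectly uniform covering of the unit square (lower is better; note this is an O(N²) computation, so it can take a few seconds for N = 2**14):

samples = {
    'Regular': np.column_stack([X, Y]),
    'Halton': np.column_stack([X_halton, Y_halton]),
    'Sobol': np.column_stack([X_sobol, Y_sobol]),
}

for name, sample in samples.items():
    # Centered discrepancy of each point set over the unit square
    print(name, qmc.discrepancy(sample, method='CD'))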

Comparing convergences

Now, we plot the running estimate of π from each method as a function of the number of points, on a log2 scale.

# Running estimate: 4 * (cumulative number of hits) / (number of points so far);
# the first 32 values are dropped before plotting
df_sobol = pd.Series(4*(np.cumsum(is_inside((X_sobol, Y_sobol), radius, center))/np.arange(1, N + 1)))
df_sobol = df_sobol.to_frame(name='Sobol')
df_sobol = df_sobol.iloc[32:]

df_regular = pd.Series(4*(np.cumsum(is_inside((X, Y), radius, center))/np.arange(1, N + 1)))
df_regular = df_regular.to_frame(name='Regular')
df_regular = df_regular.iloc[32:]

df_halton = pd.Series(4*(np.cumsum(is_inside((X_halton, Y_halton), radius, center))/np.arange(1, N + 1)))
df_halton = df_halton.to_frame(name='Halton')
df_halton = df_halton.iloc[32:]

fig, ax = plt.subplots(figsize=(17, 8))


plt.ylabel('Running Average Output', fontsize=11)
plt.xlabel('# of Points', fontsize=11)
plt.title('Convergence Comparison')

sns.lineplot(pd.concat([df_regular, df_sobol, df_halton],axis=1), ax=ax)
plt.axhline(y=np.pi, color='black', linestyle='--')

ax.set_xscale('log', base=2)
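As a quick numerical check to go with the plot, we can look at the absolute error of each final running estimate (the exact numbers vary from run to run, since the regular sampler is unseeded and the low-discrepancy samplers are scrambled):

for name, df in [('Regular', df_regular), ('Halton', df_halton), ('Sobol', df_sobol)]:
    final_estimate = df[name].iloc[-1]
    print(f'{name}: estimate = {final_estimate:.5f}, abs. error = {abs(final_estimate - np.pi):.2e}')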

All of them converge toward the exact answer, but the low-discrepancy sequences (Halton and Sobol) converge faster: the error of standard Monte Carlo shrinks roughly as 1/√N, while low-discrepancy (quasi-Monte Carlo) sampling can approach an error of order (log N)^d / N, where d is the dimension. This means fewer computational resources are required to achieve the same level of accuracy. In more complex problems this difference can be even larger, making it essential to choose the right generator when performing a Monte Carlo method.

What about higher dimensions?

Now, we repeat the process in a higher dimension. Sampling points in 3-D space, we compare the volume of a cube with that of an inscribed sphere: the sphere-to-cube volume ratio is (4/3)πr³ / (2r)³ = π/6, so multiplying the hit fraction by 6 again estimates π. We get the following result:

# Calculating pi using 3d Halton Monte Carlo

X_halton, Y_halton, Z_halton = halton(n=N, d=3).T

X_halton = X_halton*2-1
Y_halton = Y_halton*2-1
Z_halton = Z_halton*2-1

df_halton_3d = pd.Series(6*(np.cumsum([np.sqrt(x**2 + y**2 + z**2) < 1 for x,y,z in zip(X_halton, Y_halton, Z_halton)])/np.arange(1,N + 1)))
df_halton_3d = df_halton_3d.to_frame(name='Halton')
df_halton_3d = df_halton_3d.iloc[32:]

# Calculating pi using 3d Sobol Monte Carlo

X_sobol, Y_sobol, Z_sobol = sobol(m=int(np.log2(N)) ,d=3).T

X_sobol = X_sobol*2-1
Y_sobol = Y_sobol*2-1
Z_sobol = Z_sobol*2-1

df_sobol_3d = pd.Series(6*(np.cumsum([np.sqrt(x**2 + y**2 + z**2) < 1 for x,y,z in zip(X_sobol, Y_sobol, Z_sobol)])/np.arange(1,N + 1)))
df_sobol_3d = df_sobol_3d.to_frame(name='Sobol')
df_sobol_3d = df_sobol_3d.iloc[32:]

# Calculating pi using 3d Regular Monte Carlo

X, Y, Z = np.random.uniform(-1,1,(3, N))

df_regular_3d = pd.Series(6*(np.cumsum([np.sqrt(x**2 + y**2 + z**2) < 1 for x,y,z in zip(X, Y, Z)])/np.arange(1,N + 1)))
df_regular_3d = df_regular_3d.to_frame(name='Regular')
df_regular_3d = df_regular_3d.iloc[32:]


def plot_3D_Monte_Carlo(fig, pos, title, X, Y, Z):

    ax = fig.add_subplot(pos, projection='3d')

    ax.set_xlim(-1.0, 1.0)
    ax.set_ylim(-1.0, 1.0)
    ax.set_zlim(-1.0, 1.0)

    ax.set_xticklabels([])
    ax.set_yticklabels([])
    ax.set_zticklabels([])

    ax.set_title(title)
    ax.set_aspect("equal")

    # draw the cube
    r = [-1, 1]
    for s, e in combinations(np.array(list(product(r, r, r))), 2):
        if np.sum(np.abs(s - e)) == r[1] - r[0]:
            ax.plot3D(*zip(s, e), color="black")

    # draw the sphere
    u, v = np.mgrid[0:2*np.pi:20j, 0:np.pi:20j]
    x = np.cos(u)*np.sin(v)
    y = np.sin(u)*np.sin(v)
    z = np.cos(v)
    ax.plot_wireframe(x, y, z, color="black", linewidths=.7)

    # draw the sampled points: red inside the sphere, blue outside
    ax.scatter(X, Y, Z,
               c=['blue' if np.sqrt(x**2 + y**2 + z**2) > 1 else 'red' for x, y, z in zip(X, Y, Z)],
               s=.1)

fig = plt.figure(figsize=plt.figaspect(.23))

plot_3D_Monte_Carlo(fig,131,'Regular',X,Y,Z)
plot_3D_Monte_Carlo(fig,132,'Halton',X_halton,Y_halton,Z_halton)
plot_3D_Monte_Carlo(fig,133,'Sobol',X_sobol,Y_sobol,Z_sobol)

plt.show()

This time, it is much harder to see any difference because we are trying to visualize a 3D object on a 2D screen, but we can compare the convergence:

fig, ax = plt.subplots(figsize=(17, 8))

plt.ylabel('Running Average Output', fontsize=11)
plt.xlabel('# of Points', fontsize=11)
plt.title('Convergence Comparison')

sns.lineplot(pd.concat([df_regular_3d, df_halton_3d, df_sobol_3d],axis=1), ax=ax)
plt.axhline(y=np.pi, color='black', linestyle='--')


ax.set_xscale('log', base=2)

For all methods, it takes more iterations to achieve the same level of convergence, which indicates that the higher the dimension, the harder it is to obtain a satisfactory result. We can also see that the performance gap between regular sampling and the low-discrepancy sequences widens as the dimension grows.

In summary, carefully selecting the way we generate “random” numbers can improve the performance of Monte Carlo methods, minimizing the numerical error in our computations.
