Estimating the Parameters for a Geometric Brownian Motion Stochastic Process using Two Different Approaches

Roi Polanitzer
13 min read · Jan 17, 2024


The stochastic process called Geometric Brownian Motion (essentially a continuous-time random walk for prices) is the most commonly used process, thanks to its simplicity and wide range of applications.

In this article I will show how to estimate the parameters for the Geometric Brownian Motion Process using two different approaches.

The first approach is to fit the stochastic process (i.e., the Geometric Brownian Motion model) to a set of historical data. The second approach is simply to compute the expected returns and the volatility of the historical data. In the first approach we need to make assumptions about the initial values of the parameters.

1. Stochastic Process

A stochastic process is nothing but a mathematically defined equation that can create a series of outcomes over time, outcomes that are not deterministic in nature; that is, an equation or process that does not follow any simple discernible rule such as "price will increase X percent every year" or "revenues will increase by this factor of X plus Y percent."

A stochastic process is by definition nondeterministic, and one can plug numbers into a stochastic process equation and obtain different results every time. For instance, the path of a stock price is stochastic in nature, and one cannot reliably predict the exact stock price path with any certainty.

However, the price evolution over time is enveloped in a process that generates these prices. The process is fixed and predetermined, but the outcomes are not.

Hence, by stochastic simulation, we create multiple pathways of prices, obtain a statistical sampling of these simulations, and make inferences on the potential pathways that the actual price may undertake given the nature and parameters of the stochastic process used to generate the time series.
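To make this concrete, here is a minimal simulation sketch (my own illustration, with arbitrary parameters rather than anything estimated later in this article) that generates many price paths from the geometric Brownian motion process discussed below and summarizes the terminal prices:

# A minimal sketch (my own illustration): simulate many GBM price paths and
# summarize the terminal prices; S0, mu and sigma here are arbitrary examples.
import numpy as np

S0, mu, sigma = 100.0, 0.10, 0.20          # initial price, annual drift, annual volatility
T, n_steps, n_paths = 1.0, 252, 10_000
dt = T / n_steps

rng = np.random.default_rng(42)
eps = rng.standard_normal((n_paths, n_steps))
# exact GBM step: S(t+dt) = S(t) * exp((mu - 0.5*sigma**2)*dt + sigma*sqrt(dt)*eps)
log_returns = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * eps
paths = S0 * np.exp(np.cumsum(log_returns, axis=1))

# statistical sampling of the simulated terminal prices
print("mean terminal price:", paths[:, -1].mean())
print("5th / 95th percentiles:", np.percentile(paths[:, -1], [5, 95]))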

2. Markov Processes

In efficient markets, financial prices should display a random walk pattern. More precisely, prices are assumed to follow a Markov process, which is a particular stochastic process independent of its history — the entire distribution of the future price relies on the current price only. The past is irrelevant.

These processes are built from the following components, described in order of increasing complexity.

2.1. The Wiener Process

The Wiener process describes a variable z whose change Δz, measured over the interval Δt, has a mean of zero and a variance proportional to Δt:

Δz ~ N(0, Δt)

If ϵ is a standard normal variable N(0, 1), this can be written as

Δz = ϵ√Δt

In addition, the increments Δz are independent across time.
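As a quick sanity check (my own illustration, not part of the original text), we can simulate Wiener increments and verify their mean and variance:

# A quick sanity check (my own illustration): Wiener increments Δz = ϵ√Δt have
# mean close to zero and variance close to Δt.
import numpy as np

dt = 1 / 252
rng = np.random.default_rng(0)
dz = rng.standard_normal(100_000) * np.sqrt(dt)

print("sample mean of the increments:", dz.mean())      # close to 0
print("sample variance of the increments:", dz.var())   # close to dt = 0.00397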

2.2. The Generalized Wiener Process

The generalized Wiener process describes a variable x whose change Δx is built up from a Wiener process, with in addition a constant trend a per unit time and volatility b:

Δx = aΔt + bΔz

A particular case is the martingale, which is a zero-drift stochastic process, a = 0, which leads to E(Δx) = 0. This has the convenient property that the expectation of a future value is the current value:

E(x_T) = x_t

2.3. The Ito Process

The Ito process describes a generalized Wiener process whose trend and volatility depend on the current value of the underlying variable and time:

Δx = a(x, t)Δt + b(x, t)Δz

This is a Markov process because the distribution depends only on the current value of the random variable x, as well as time. In addition, the innovation in this process has a normal distribution.

3. Geometric Brownian Motion (GBM)

A particular example of an Ito process is the geometric Brownian motion (GBM), which is described for the variable S as

ΔS = μSΔt + σSΔz

The process is geometric because the trend and volatility terms are proportional to the current value of S. This is typically the case for stock prices, for which rates of return appear to be more stationary than raw dollar changes, ΔS. It is also used for currencies.

Because ΔS/S represents the capital appreciation only, abstracting from dividend payments, μ is the annualized growth or drift rate, which represents the expected total rate of return on the asset minus the rate of income payment (the dividend yield in the case of stocks). σ is the annualized volatility or shock rate, which represents the standard deviation of the asset's returns.

This model is particularly important because it is the underlying process for the Black-Scholes formula. The key feature of this distribution is the fact that the volatility is proportional to S.

This ensures that the stock price will stay positive. Indeed, as the stock price falls, its variance decreases, which makes it unlikely to experience a large down move that would push the price into negative values.

As the limit of this model is a normal distribution for the change in the logarithm of the price, ln(S_T) − ln(S_t), S follows a lognormal distribution. This process implies that, over an interval T − t = τ, the logarithm of the ending price is distributed as

ln(S_T) = ln(S_t) + (μ − σ²/2)τ + σ√τ ϵ

where ϵ is a standardized normal variable.

Whether a lognormal distribution is much better than the normal distribution depends on the horizon considered. If the horizon is one day only, the choice of the lognormal versus normal assumption does not really matter.

It is highly unlikely that the stock price would drop below zero in one day, given typical volatilities. On the other hand, if the horizon is measured in years, the two assumptions do lead to different results.

The lognormal distribution is more realistic, as it prevents prices from turning negative. In simulations, this process is approximated by small steps with a normal distribution with mean and variance given by

ΔS/S ~ N(μΔt, σ²Δt)

With just one step n = 1, the distribution must be normal. As the number of steps n grows large, the distribution tends to a lognormal distribution.
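A small sketch of this discretization (my own illustration, with illustrative parameters): each step applies a normal shock with mean μΔt and standard deviation σ√Δt to the current price, and with many steps the terminal price becomes right-skewed, as a lognormal distribution is.

# A minimal sketch (my own illustration) of the discretization: each step applies
# a normal shock with mean μΔt and standard deviation σ√Δt to the current price.
import numpy as np

S0, mu, sigma, T = 100.0, 0.10, 0.20, 1.0
rng = np.random.default_rng(1)

def simulate_terminal_prices(n_steps, n_paths=50_000):
    dt = T / n_steps
    S = np.full(n_paths, S0)
    for _ in range(n_steps):
        dS = S * (mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths))
        S = S + dS
    return S

# with one step the terminal price is exactly normal (skewness about 0);
# with many steps it becomes right-skewed, approaching a lognormal distribution
for n in (1, 252):
    S_T = simulate_terminal_prices(n)
    skew = ((S_T - S_T.mean()) ** 3).mean() / S_T.std() ** 3
    print(n, "step(s) -> skewness of the terminal price:", round(skew, 3))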

While very useful to model stock prices, this model has shortcomings. Price increments are assumed to have a normal distribution. In practice, we observe that price changes have fatter tails than the normal distribution.

Returns may also experience changing variances. In addition, as the time interval Δt shrinks, the volatility shrinks as well. This implies that large discontinuities cannot occur over short intervals.

In reality, some assets experience discrete jumps, such as commodities. The stochastic process, therefore, may have to be changed to accommodate these observations.

3.1. Example: A Stock Price that Follows a Geometric Brownian Motion Process

Consider a stock that pays no dividends, has an expected return of 10% per annum, and volatility of 20% per annum. If the current price is $100, what is the process for the change in the stock price over the next week? What if the current price is $10?

The process for the stock price is

ΔS = S(μΔt + σϵ√Δt)

where ϵ is a random draw from a standard normal distribution. If the interval is one week, then Δt = 1/52 = 0.01923.

The trend is μΔt = 0.10 × 0.01923 = 0.001923 and the volatility is σ√Δt = 0.20 × √0.01923 = 0.027735.

The Geometric Brownian Motion process is ΔS = S(0.001923 + 0.027735 × ϵ).

With an initial stock price at $100, this gives ΔS = $0.1923 + $2.7735ϵ.

With an initial stock price at $10, this gives ΔS = $0.01923 + $0.27735ϵ. The trend and volatility are scaled down by a factor of ten.

Now, let’s assume the price in one week is given by S = $100exp(R), where R has annual expected value of 10% and volatility of 20%. Construct a 95% confidence interval for S.

The standard normal deviates that correspond to a 95% confidence interval are α_MIN = −1.96 and α_MAX = 1.96. In other words, we have 2.5% in each tail.

The lower bound of the 95% confidence band for R is called R_MIN:

R_MIN = μΔt − 1.96σ√Δt = 0.001923 − 1.96 × 0.027735 = −0.0524

The upper bound of the 95% confidence band for R is called R_MAX:

R_MAX = μΔt + 1.96σ√Δt = 0.001923 + 1.96 × 0.027735 = 0.0563.

This gives S_MIN = $100exp(−0.0524) = $94.89, and S_MAX = $100exp(0.0563) = $105.79.
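These numbers can be reproduced with a few lines of Python (a small sketch I added, using only the inputs stated in the example):

# A small sketch I added, reproducing the numbers in this example.
import numpy as np

S0, mu, sigma = 100.0, 0.10, 0.20
dt = 1 / 52                                  # one week

trend = mu * dt                              # 0.001923
vol = sigma * np.sqrt(dt)                    # 0.027735

R_min = trend - 1.96 * vol                   # about -0.0524
R_max = trend + 1.96 * vol                   # about  0.0563
S_min, S_max = S0 * np.exp(R_min), S0 * np.exp(R_max)

print(f"trend = {trend:.6f}, volatility = {vol:.6f}")
print(f"95% confidence band for S: [{S_min:.2f}, {S_max:.2f}]")   # about [94.89, 105.79]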

4. The MLE Approach for Estimating the GBM Process Parameters

Maximum Likelihood Estimation, simply known as MLE, is a traditional probabilistic approach that can be applied to data belonging to any distribution, e.g., Normal, Poisson, Bernoulli, etc. With a prior assumption or knowledge about the data distribution, Maximum Likelihood Estimation helps find the distribution parameters that are most likely to have generated the data.

For instance, let us say we have data that is assumed to be normally distributed, but we do not know its mean and standard deviation. Maximum Likelihood Estimation iteratively searches for the mean and standard deviation that are most likely to have generated the data.

Moreover, Maximum Likelihood Estimation can be applied to both regression and classification problems. At its core, Maximum Likelihood Estimation is simply an optimization algorithm that searches for the most suitable parameters.

Since we assume the data distribution a priori, the algorithm only needs to find the parameters of that distribution. The approach is very general, so the user must devise a Python function that encodes the likelihood of the particular problem at hand.

4.1. How does Maximum Likelihood Estimation work?

The term likelihood can be defined as the possibility that the parameters under consideration may generate the data. A likelihood function is simply the joint probability function of the data distribution.

The maximum of the likelihood function is attained at the most likely parameters. This maximization is performed by differentiating the likelihood function with respect to the distribution parameters and setting each derivative individually to zero.

If we look back at the basics of probability, we can see that, for independent observations, the joint probability function is simply the product of the probability functions of the individual data points. With a large dataset, it is practically difficult to formulate such a joint probability function and differentiate it with respect to the parameters.

Hence MLE works with the logarithm of the likelihood function. Since the logarithm is a strictly increasing function, maximizing the log-likelihood is the same as maximizing the likelihood itself.

The parameters obtained via either likelihood function or log-likelihood function are the same. The logarithmic form enables the large product function to be converted into a summation function.

It is quite easy to sum the individual log-likelihood terms and differentiate the result. Because of this mathematical simplicity, Maximum Likelihood Estimation scales to huge datasets with data points in the order of millions!

For each problem, the users are required to formulate the model and distribution function to arrive at the log-likelihood function. The optimization is performed using the SciPy library’s ‘optimize’ module.

The module has a function called 'minimize' that can minimize any input function with respect to its input parameters. In our case, MLE seeks to maximize the log-likelihood function.

Therefore, we supply the negative log-likelihood as the input function to the 'minimize' method. The optimizer then searches iteratively over the input parameters (approximating derivatives numerically where the chosen method needs them) and arrives at the optimal values.
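Here is a minimal sketch of that workflow on a toy example I added (a normal sample with unknown mean and standard deviation, unrelated to the article's CPI data):

# A minimal sketch of the workflow on a toy example (my own addition): estimate the
# mean and standard deviation of a normal sample by minimizing the negative
# log-likelihood with scipy.optimize.minimize.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(7)
data = rng.normal(loc=2.0, scale=0.5, size=1_000)    # synthetic, normally distributed data

def negative_log_likelihood(params):
    mu, sigma = params
    if sigma <= 0:                                    # keep sigma in its valid range
        return np.inf
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

result = minimize(negative_log_likelihood, x0=[0.0, 1.0], method="Nelder-Mead")
print("MLE estimates [mean, std]:", result.x)         # close to [2.0, 0.5]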

The parameters that are found through the MLE approach are called maximum likelihood estimates. In the sequel, we discuss the Python implementation of Maximum Likelihood Estimation with an example.

4.2. Fitting the Model to a Set of Historical Data

Our estimation date is December 31, 2023. In order to predict the Israel inflation rate for 2024, I chose the consumer-price-index (CPI) dataset sourced from the Bank of Israel. This dataset is based on the consumer price index (general) between December 31, 2018 and December 31, 2023.

A five-year time window is a normative, representative, and widely accepted choice among financial actuaries.

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from numpy import random as rn
from scipy import stats
import scipy.stats as si
import seaborn as sns
from scipy.stats import norm

cpi = pd.read_csv('CPI_Total.csv')
cpi.head()
cpi.tail()

Let’s choose initial values for μ, the growth or drift rate, and σ, the volatility or shock rate. Since the CPI observations are monthly, both parameters are measured per month here.

μ = a = 0.015
σ = b = 0.001

Let’s rename our CPI column

df = cpi.copy()
df = df.rename(columns={"CPI": "S"})
df

Now, let’s create a new column for ΔS

df["S(t+1)-S(t)"] = 0.00
for i in range(0,len(df["DATE"])-1):
df["S(t+1)-S(t)"][i] = df["S"][i+1]-df["S"][i]
df

Now, let’s create a new column for μS

df["μS(t)"] = 0.00
for i in range(0,len(df["DATE"])-1):
df["μS(t)"][i] = μ*df["S"][i]
df

Now, let’s create a new column for ΔS - μS

df["S(t+1)-S(t)-μS(t)"] = 0.00
for i in range(0,len(df["DATE"])-1):
df["S(t+1)-S(t)-μS(t)"][i] = df["S(t+1)-S(t)"][i]-df["μS(t)"][i]
df

Let’s create a function called NORMDIST, which returns the normal probability density function for a specified mean and standard deviation. This function has a very wide range of applications in statistics, including hypothesis testing. Under the GBM model, the likelihood of each monthly observation is the normal density of the residual ΔS − μS(t), with mean zero and standard deviation σS(t).

def NORMDIST(x):
    # mu and sigma are read from the enclosing scope
    z = si.norm.pdf(x, mu, sigma)
    return z

Now, let’s create a new column for the likelihood function

df["pdf"] = 0.00
for i in range(0,len(df["DATE"])-1):
mu = 0.00
sigma = σ*df["S"][i]
x = df["S(t+1)-S(t)-μS(t)"][i]
df["pdf"][i] = si.norm.pdf(x,mu,sigma)
df

Now, let’s create a new column for the log-likelihood function

df["ln(pdf)"] = 0.00
for i in range(0,len(df["DATE"])-1):
df["ln(pdf)"][i] = np.log(df["pdf"][i])
df

Now, let’s sum up the log-likelihood column

df["ln(pdf)"].sum()

-5498.5108982847005

Now, let’s find the μ and σ which maximize the sum of all the log-likelihood values

from scipy.optimize import fmin

def ImpliedBrownianMotion(c):
    df = cpi.copy()
    df = df.rename(columns={"CPI": "S"})
    df["S(t+1)-S(t)"] = 0.00
    df["μS(t)"] = 0.00
    df["S(t+1)-S(t)-μS(t)"] = 0.00
    df["pdf"] = 0.00
    df["ln(pdf)"] = 0.00
    for i in range(0, len(df["DATE"]) - 1):
        df.loc[i, "S(t+1)-S(t)"] = df["S"][i+1] - df["S"][i]
        df.loc[i, "μS(t)"] = c[0] * df["S"][i]                              # drift term μS(t)
        df.loc[i, "S(t+1)-S(t)-μS(t)"] = df["S(t+1)-S(t)"][i] - df["μS(t)"][i]
        mu = 0.00
        sigma = c[1] * df["S"][i]                                           # volatility term σS(t)
        x = df["S(t+1)-S(t)-μS(t)"][i]
        df.loc[i, "pdf"] = si.norm.pdf(x, mu, sigma)
        df.loc[i, "ln(pdf)"] = np.log(df["pdf"][i])
    f1 = df["ln(pdf)"].sum()
    val = -f1                                  # minimize the negative of the log-likelihood
    print("[μ, σ]=", c, ", Objective Function Value:", val)
    return val

c = fmin(ImpliedBrownianMotion, [0.015,0.001])
μ = c[0]
σ = c[1]
print("The monthly drift rate obtained via the MLE approach is:", "{:.4%}".format(μ))
print("The monthly volatility rate obtained via the MLE approach is:", "{:.4%}".format(σ))

The monthly drift rate obtained via the MLE approach is: 0.1792%
The monthly volatility rate obtained via the MLE approach is: 0.3420%
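For reference, here is an alternative sketch I added (not the article's code): the same negative log-likelihood written without row-by-row loops, using vectorized NumPy operations. It assumes the same cpi DataFrame and should converge to essentially the same estimates, only faster.

# A vectorized alternative (my own addition) to the loop-based objective function above.
import numpy as np
from scipy.stats import norm
from scipy.optimize import fmin

S = cpi["CPI"].to_numpy()

def negative_log_likelihood(c):
    mu_, sigma_ = c
    if sigma_ <= 0:                        # guard against invalid volatility candidates
        return np.inf
    dS = S[1:] - S[:-1]                    # S(t+1) - S(t)
    resid = dS - mu_ * S[:-1]              # ΔS - μS(t)
    return -np.sum(norm.logpdf(resid, loc=0.0, scale=sigma_ * S[:-1]))

c_vec = fmin(negative_log_likelihood, [0.015, 0.001])
print("Vectorized MLE estimates [μ, σ]:", c_vec)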

5. The Financial Approach for Estimating the GBM Process Parameters

Our estimation date is December 31, 2023. In order to predict the Israel inflation rate for 2024, I chose the consumer-price-index (CPI) dataset sourced from the Bank of Israel. This dataset is based on the consumer price index (general) between December 31, 2018 and December 31, 2023.

A five-year time window is a normative, representative, and widely accepted choice among financial actuaries.

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
from numpy import random as rn
from scipy import stats
import scipy.stats as si
import seaborn as sns
from scipy.stats import norm

cpi = pd.read_csv('CPI_Total.csv')
cpi.head()
cpi.tail()

Let's rename our CPI column


df1 = cpi.copy()
df1 = df1.rename(columns={"CPI": "Price"})
df1

Now, let’s create a new column for the relative returns

df1["Return"] = 0.00
for i in range(0,len(df1["DATE"])-1):
df1["Return"][i] = df1["Price"][i+1]/df1["Price"][i]-1
df1
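As a side note (my own addition), pandas can compute the same relative returns in a single vectorized call; the values are identical but aligned to the later date, and dropping the undefined entry excludes the trailing 0.00 that the loop above leaves in the last row of the Return column:

# A vectorized equivalent (my own addition): pct_change computes S(t+1)/S(t) - 1,
# aligned to the later date; dropna() removes the undefined first entry.
returns = df1["Price"].pct_change().dropna()
print("mean: {:.4%}, std: {:.4%}".format(returns.mean(), returns.std()))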

To estimate the μ parameter from a set of time-series data, the drift rate can be found by setting μ to the average of the relative returns:

mu = df1["Return"].mean()
print("The monthly drift rate as the average of all the relative returns is:", "{:.4%}".format(mu))

The monthly drift rate as the average of all the relative returns is: 0.1764%

To estimate the σ parameter from a set of time-series data, the volatility rate can be found by setting σ to the standard deviation of all the relative returns:

sigma = df1["Return"].std()
print("The monthly volatility rate as the standard deviation of all the relative returns is:", "{:.4%}".format(sigma))

The monthly volatility rate as the standard deviation of all the relative returns is: 0.3424%
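A small extension I added (not in the original article): since the CPI observations are monthly, the estimates above are monthly parameters; under GBM the drift scales linearly with time and the volatility with the square root of time, so annual figures follow directly.

# A small extension (my own addition): converting the monthly estimates to annual terms.
annual_mu = 12 * mu                        # drift scales linearly with time
annual_sigma = sigma * (12 ** 0.5)         # volatility scales with the square root of time
print("The annualized drift rate is:", "{:.4%}".format(annual_mu))
print("The annualized volatility rate is:", "{:.4%}".format(annual_sigma))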

6. Conclusion

models = ['MLE','Finance','Initial Values']
drift_rate = [μ, mu, a]
volatility_rate = [σ, sigma, b]
compare_models = pd.DataFrame({ "Approaches": models, "drift rate": drift_rate, "volatility rate": volatility_rate })
compare_models
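As an optional extension (my own sketch, not part of the original article), the estimated monthly parameters can be plugged back into the GBM process to simulate CPI paths twelve months ahead, which is one way to turn the estimates into the 2024 inflation forecast mentioned earlier; the variables μ, σ and cpi are those defined above.

# An optional sketch (my own addition): simulate 12 monthly GBM steps of the CPI
# from the last observed level, using the MLE estimates μ and σ defined above.
import numpy as np

S0 = float(cpi["CPI"].iloc[-1])            # last observed CPI level (December 31, 2023)
n_paths, n_months = 10_000, 12
rng = np.random.default_rng(2024)

S = np.full(n_paths, S0)
for _ in range(n_months):
    # one monthly step: ΔS = S(μΔt + σ√Δt ϵ), with Δt = 1 month
    S = S + S * (μ + σ * rng.standard_normal(n_paths))

implied_inflation = S / S0 - 1
lo_band, hi_band = np.percentile(implied_inflation, [5, 95])
print("median simulated 12-month inflation: {:.2%}".format(np.median(implied_inflation)))
print("90% interval: [{:.2%}, {:.2%}]".format(lo_band, hi_band))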

As I showed in this article, there are several ways to estimate the parameters of the GBM process. One can fit the model to a set of historical data via maximum likelihood estimation, compute the expected returns and the volatility of the historical data directly, or simply make assumptions about these values.

About the Author

Roi Polanitzer, FRM, F.IL.A.V.F.A., QFV

Actuary Roi Polanitzer is the owner and chief valuator of Intrinsic Value. He is one of the three leading experts in Israel in the areas of quantitative finance, option valuation, risk management and financial engineering. He has consulted for accounting firms, financial advisory firms, investigative auditing firms and publicly-traded and privately-held companies in Israel on risk analysis, valuation, and real options, and has written numerous papers and articles on those topics. He is the founder and Chairman of the Israel Association of Financial Valuators and Actuaries. Actuary Polanitzer is also the founder and CEO of the Professional Data Scientists’ Israel Association and the owner and chief data scientist of Prediction consultants. He has an MBA in business administration and BA in economics. He is a certified Financial Risk Manager (FRM), a certified Chartered Risk Manager (CRM), a certified Quantitative Finance Valuator (QFV), a certified Financial and Economic Modeler (FEM), a certified Market Risk Actuary (MRA), a certified Credit Risk Actuary (CRA), a certified Python Data Analyst (PDA), and a certified Professional Data Scientist (PDS).



Roi Polanitzer

Chief Data Scientist at Prediction Consultants — Advanced Analysis and Model Development. https://polanitz8.wixsite.com/prediction/english