Kalman filter in stock trading

Dhanoop Karunakaran
Intro to Artificial Intelligence
5 min read · Sep 11, 2023
Kalman Filter. Source:[1]

What is a Kalman filter?

When measurements are subject to noise, the Kalman filter (KF) algorithm can estimate the true state of the object being tracked. The algorithm has two steps: the prediction step and the measurement update step. The filter combines the measurement from a noisy sensor with the prediction from a physics-based model (for instance, velocity × time gives the distance travelled) to produce an optimal estimate. Let’s try to estimate a car’s position using a GPS sensor. A GPS sensor cannot measure the vehicle’s position exactly, and accurate positioning is critical for safety-critical applications such as self-driving cars, so our aim is to estimate the actual position of the vehicle.

Car’s position at the previous timestep, k-1

The above figure shows the car’s position at the previous timestep as a Gaussian distribution. The variance indicates how uncertain the estimate is: the smaller the variance, the more accurate the estimate.

In the prediction stage, the filter predicts the car’s position at the current timestep, k, using the physics model. The below figure shows the predicted position as another Gaussian distribution with a larger variance. The variance grows because the new position is computed from a simple physics model alone, without any data from sensors, so the car could be anywhere within that wider distribution.

Car’s positions at previous timesteps and prediction
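To make the prediction stage concrete, here is a minimal one-dimensional sketch. It assumes a constant-velocity physics model and an additive process-noise variance; the function name `predict` and all the numbers are hypothetical and chosen only for illustration.

```python
def predict(mean, variance, velocity, dt, process_variance):
    """Prediction step of a 1-D Kalman filter (constant-velocity model assumed)."""
    # Physics model: new position = old position + velocity * time
    predicted_mean = mean + velocity * dt
    # No sensor data is used here, so the uncertainty can only grow
    predicted_variance = variance + process_variance
    return predicted_mean, predicted_variance


# Hypothetical values: position 10 m with variance 4 at timestep k-1
prev_mean, prev_var = 10.0, 4.0
pred_mean, pred_var = predict(prev_mean, prev_var,
                              velocity=5.0, dt=1.0, process_variance=2.0)
print(pred_mean, pred_var)  # 15.0 6.0
```

The predicted variance (6.0) is larger than the previous one (4.0), which corresponds to the wider Gaussian in the figure.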

During the measurement update stage, we get the car’s current position from the GPS sensor, shown as the orange distribution in the below figure.

Car’s position at prediction and measurement update steps

As the GPS sensor is also noisy, we combine the prediction with the sensor measurement to estimate the optimal position, as shown below. This is done by multiplying the two probability density functions together; the result is (up to a normalizing constant) another Gaussian function.

Combining the two distributions gives the optimal state estimate x_k of the car’s position at timestep k

This way we can estimate the true state of the object being tracked even if the measurements from the sensors are noisy.
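The multiplication of the two Gaussians can be sketched in a few lines. This is a minimal one-dimensional example; the helper name `fuse_gaussians` and the numbers are hypothetical. It relies on the standard result that the product of two Gaussian densities, once renormalized, is a Gaussian whose variance is smaller than either input.

```python
def fuse_gaussians(mean_a, var_a, mean_b, var_b):
    """Mean and variance of the renormalized product of two 1-D Gaussians."""
    # Each mean is weighted by the *other* variance, so the more certain
    # distribution pulls the result towards itself
    fused_mean = (var_b * mean_a + var_a * mean_b) / (var_a + var_b)
    fused_var = (var_a * var_b) / (var_a + var_b)
    return fused_mean, fused_var


# Hypothetical values: prediction N(15, 6) and GPS measurement N(17, 3)
pred_mean, pred_var = 15.0, 6.0
gps_mean, gps_var = 17.0, 3.0
est_mean, est_var = fuse_gaussians(pred_mean, pred_var, gps_mean, gps_var)
print(round(est_mean, 3), est_var)  # 16.333 2.0
```

The fused estimate sits closer to the GPS reading because the GPS variance is the smaller of the two in this example, and the fused variance (2.0) is lower than both inputs.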

Two steps in the Kalman filter, where x is the mean and P is the variance. Source:[4]

The above figure shows the steps the Kalman filter uses to estimate the true state of the object being tracked. In the update step, we use the predicted value from the prediction step and the measured value to find the error in the estimate. The Kalman gain controls how much weight is given to the predicted value versus the measured value, and so determines whether the estimated state lies closer to the prediction or to the measurement.

K = Error In Prediction / (Error in Prediction + Error in Measurement) [4]

Using these values, we can find the estimated state (x_k and P_k) in the update step. Please note that the KF uses other parameters as well; for simplicity, they are not covered here.
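As a rough illustration of how the gain enters the update, the sketch below runs both steps for a one-dimensional car position. The function name `kalman_step`, the constant-velocity model, the noise variances, and the GPS readings are all assumptions made for this example, not values from the sources.

```python
def kalman_step(mean, var, measurement, velocity, dt, process_var, meas_var):
    """One predict + update cycle of a 1-D Kalman filter."""
    # Prediction step: move the mean with the physics model, inflate the variance
    pred_mean = mean + velocity * dt
    pred_var = var + process_var
    # Kalman gain: K = error in prediction / (error in prediction + error in measurement)
    gain = pred_var / (pred_var + meas_var)
    # Update step: shift the prediction towards the measurement by a fraction K
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var, gain


mean, var = 0.0, 4.0                    # hypothetical initial state
gps_readings = [5.3, 9.6, 15.4, 19.8]   # hypothetical noisy measurements
gains = []
for z in gps_readings:
    mean, var, gain = kalman_step(mean, var, z, velocity=5.0, dt=1.0,
                                  process_var=1.0, meas_var=4.0)
    gains.append(gain)
    print(f"z={z:5.1f}  K={gain:.3f}  x={mean:6.3f}  P={var:.3f}")
```

A gain close to 1 means the filter trusts the measurement; a gain close to 0 means it trusts the prediction. With these numbers the gain shrinks over the first few steps as the estimate becomes more certain.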

KF in stock trading

Moving average (blue line) computed with a window size of 50 days. Source:[5]

One use case of the algorithm in stock trading is smoothing price data, similar to a moving average (MA).

A moving average is a technical analysis tool that helps smooth out price data by creating a constantly updated average price. It is commonly used by traders to identify trends and potential trading opportunities.
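For reference, a simple moving average can be written in plain Python as below. This is only a sketch: the function name `moving_average` and the price list are hypothetical, and positions that do not yet have a full window are returned as `None`.

```python
def moving_average(prices, window):
    """Simple moving average; entries without a full window are None."""
    averages = []
    running_sum = 0.0
    for i, price in enumerate(prices):
        running_sum += price
        if i >= window:
            # Drop the price that has just left the window
            running_sum -= prices[i - window]
        averages.append(running_sum / window if i >= window - 1 else None)
    return averages


prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # hypothetical closing prices
sma3 = moving_average(prices, window=3)
print(sma3)  # [None, None, 11.0, 12.0, 13.0, 14.0]
```

Note that the result depends directly on the chosen `window`, and the first `window - 1` points have no value at all.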

One advantage of the KF over the MA is that it does not require choosing a window length, which reduces the overfitting concern. The noise covariances of the filter still have to be chosen, though.

Implementation in Python

I’ve implemented sample code for this stock trading use case and published it in a GitHub repo. Please have a look if you want to play with it. Here are the steps for implementing the algorithm.

1. Import the data

import pandas as pd

# Load pricing data for a security
df = pd.read_csv('data/IFNNY.csv')
x = df['Adj Close']

2. We can use the pykalman library [6] to implement the algorithm. The filter needs its parameters to be initialized before it can estimate the true state. We have taken commonly used default values for these parameters.

from pykalman import KalmanFilter

# Construct a Kalman filter with a scalar state (the smoothed price)
kf = KalmanFilter(transition_matrices=[1],
                  observation_matrices=[1],
                  initial_state_mean=0,
                  initial_state_covariance=1,
                  observation_covariance=1,
                  transition_covariance=.0001)

3. Compute the Kalman filter estimate of the rolling mean and its covariance.

mean, cov = kf.filter(x.values)
mean = pd.Series(mean.flatten(), index=x.index)

4. To compare, we can compute moving averages with 30- and 60-day windows.

# Compute the rolling mean with various lookback windows
mean30 = x.rolling(window=30).mean()
mean60 = x.rolling(window=60).mean()

5. We can plot the Kalman filter estimate alongside the 30- and 60-day moving averages.

As you can see, the Kalman estimate is much smoother than the moving average lines and less prone to overfitting.
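To get a feel for what the parameters in step 2 do, here is a hand-rolled scalar filter in plain Python that mirrors that configuration (transition and observation both equal to 1, i.e. a random-walk state observed directly). It is a sketch, not pykalman’s actual implementation: the function name `scalar_kalman_filter` is hypothetical, and it assumes the first observation is combined directly with the initial state without a prior prediction step.

```python
def scalar_kalman_filter(observations, initial_mean=0.0, initial_var=1.0,
                         observation_var=1.0, transition_var=1e-4):
    """Filter a 1-D series with a random-walk state model."""
    means, variances = [], []
    mean, var = initial_mean, initial_var
    for t, z in enumerate(observations):
        if t > 0:
            # Prediction: the state is assumed to stay put, uncertainty grows slightly
            var = var + transition_var
        # Update: blend prediction and observation using the Kalman gain
        gain = var / (var + observation_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        means.append(mean)
        variances.append(var)
    return means, variances


prices = [100.0] * 200  # hypothetical flat price series
means, variances = scalar_kalman_filter(prices)
print(means[0])  # 50.0
print(round(means[-1], 2), round(variances[-1], 4))
```

With an initial mean of 0 and equal initial and observation variances, the first estimate is only halfway to the first price, so the early part of the filtered series needs some observations to warm up. A small `transition_var` relative to `observation_var` keeps the gain low and the output smooth; a larger value makes the estimate follow the prices more closely.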

If you like my write-up, follow me on GitHub, LinkedIn, and/or Medium.

References

  1. https://au.mathworks.com/videos/series/understanding-kalman-filters.html
  2. https://tng-daryl.medium.com/implementing-the-kalman-filter-on-stock-data-1dce3a192a93
  3. https://www.youtube.com/watch?v=YavO2-sNVcs
  4. https://towardsdatascience.com/kalman-filter-interview-bdc39f3e6cf3
  5. https://top10stockbroker.com/technical-analysis/moving-averages/
  6. https://github.com/pykalman/pykalman
  7. https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/ai-for-finance/solution/kalman_filters_solution.ipynb
