Kalman Filter and Its Application in Marketing Analytics

Jul 11, 2019

FOR INSPIRATIONAL PURPOSE ONLY

I first learned of the Kalman Filter in my advanced statistics class, taught by Professor Prasad Naik in the MSBA program at UC Davis. I came in with a solid ten-plus years in marketing analytics, having applied advanced statistical techniques firsthand in my jobs. Through techniques of this kind I contributed to the growth of Providian Financial, whose stock rose from ~$2.50/share when I first joined the company to ~$18/share when it was bought by Washington Mutual. This phenomenal growth happened within a short period of five years.

With that experience behind me, I became intrigued when Professor Naik taught us an algorithm that is used in missile position detection, space travel, and ship navigation. Here is the crude form of the algorithm for marketing analytics:

Figure 1

I am not sure why it is called an algorithm; what is shown here is simply a system of equations whose variables denote various metrics of a marketing campaign. These variables are:

Figure 2

When Professor Naik explained it, it somehow appeared pretty straightforward. Maybe he had already laid out a really solid series of lessons for us before leading up to this lecture. (I am debating whether to say more about how Professor Naik delivered this algorithm, and how I jotted down in my study guide what I thought was important and not to be missed for our Comprehensive Exam, or whether I should get right back to the Kalman Filter.)

Figure 3

Let’s get right back to the Kalman Filter. So far we have been talking about the Kalman Filter in marketing, but let’s take a quick look at the Kalman Filter for space travel and navigation in general, where I have better explanations, notes, and demonstrations. If you look closely at where I highlighted Kalman Filter in dark green, I captured what Professor Naik said, something along these lines, just in a slightly different order:

Kalman Filter: estimates the unobserved states of a dynamic system, vector A. For each t, what is the best estimate of the true position, velocity, and acceleration? 1) Prediction based on dynamic model 2) Correction based on observed data. Kalman Gain Factor Balances model accuracy vs. data precision. (from Professor Prasad Naik lectures, Spring Quarter of 2019)

Basically, the highlighted text above is expressed by equations that span several lines at a time, unlike kindergarten algebra where we have just one line.

The dynamic system is defined like this:

Figure 4

I think P stands for position, v for velocity, a for acceleration, and t for time. I forgot what the last vector represents; I think it holds the errors. D stands for drift, I believe. (I am not sure exactly what it is; I think it is the distance traveled given t.)

Using the system of equations above, we can predict the “true position, velocity, and acceleration.” This prediction, which is derived mathematically, is compared with the observed data so that it can be “corrected” in case the prediction is off or the measurement of the observed data is off. I know this can be a little confusing, but just know that there are two options when it comes to deciding what to report: 1) should we report the answers given by the model, or 2) should we report the answers given by our measurements? There is an equation for this, called the Kalman Gain Factor. It helps correct the numbers by putting “more weight on models if data are imprecise” and “more weight on the data if models are inaccurate.”
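Since Figure 4 is not reproduced in text, here is a minimal sketch of what a prediction step for position, velocity, and acceleration can look like. It assumes the common constant-acceleration motion model; the function name `predict_state`, the matrix `F`, and the numbers are illustrative assumptions rather than anything taken from the lecture, and the error and drift terms are left out.

```python
import numpy as np

def predict_state(state, dt):
    """Advance [position, velocity, acceleration] by dt, ignoring noise.

    Assumes a constant-acceleration model (an assumption, not Figure 4).
    """
    F = np.array([
        [1.0, dt, 0.5 * dt**2],  # p_new = p + v*dt + 0.5*a*dt^2
        [0.0, 1.0, dt],          # v_new = v + a*dt
        [0.0, 0.0, 1.0],         # a_new = a
    ])
    return F @ state

# Made-up starting values: position 0, velocity 2, acceleration 1.
state = np.array([0.0, 2.0, 1.0])
print(predict_state(state, dt=1.0))
```

With these made-up inputs the predicted position is 2.5, the velocity is 3, and the acceleration stays at 1.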

Figure 5
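To see that weighting in numbers, here is a minimal scalar sketch. It assumes the simplest case, where the state is measured directly, so the gain reduces to the model's forecast variance divided by the sum of the forecast variance and the measurement variance. The name `kalman_gain` and the values are hypothetical, and the formula in Figure 5 may be written with different symbols.

```python
def kalman_gain(prior_var, meas_var):
    """Scalar gain: the share of the forecast error applied as a correction.

    prior_var: variance of the model's forecast (model uncertainty)
    meas_var:  variance of the measurement noise (data imprecision)
    """
    return prior_var / (prior_var + meas_var)

# Precise data (small measurement variance): gain near 1, trust the data.
precise_data = kalman_gain(prior_var=4.0, meas_var=0.25)

# Imprecise data (large measurement variance): gain near 0, trust the model.
noisy_data = kalman_gain(prior_var=4.0, meas_var=36.0)

print(round(precise_data, 3), round(noisy_data, 3))  # 0.941 0.1
```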

But how did this equation come about, and how does it fit into the dynamic system of equations above?

The Kalman Gain Factor, I guess, comes from the variance calculation of the dynamic system of equations. (This is my interpretation based on the equation.)

Figure 6

The sigma symbol represents the variance at time t given t. If we distribute the last line of the equation above, which measures the variance at time t given time t, we get:

Figure 7

Then we bring the second part to the other side of the equal sign and divide by what is not K, so that K is alone on the left side and that part of the equation becomes a denominator on the right side. Please note that the -1 denotes an inverse, which is a denominator written differently.

Figure 8
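For reference, the textbook form of the gain and the variance update for a linear model is sketched below. The symbols H (the matrix linking the state to the observation) and R (the measurement noise variance) are assumptions borrowed from standard notation and may not match the symbols in Figures 6 to 8.

```latex
\begin{aligned}
K_t &= \Sigma_{t|t-1} H^{\top} \left( H \Sigma_{t|t-1} H^{\top} + R \right)^{-1} \\
\Sigma_{t|t} &= (I - K_t H)\, \Sigma_{t|t-1}
             = \Sigma_{t|t-1} - K_t H \Sigma_{t|t-1}
\end{aligned}
```

In the scalar case with H = 1, the gain reduces to the forecast variance divided by the forecast variance plus R, so the -1 power becomes an ordinary denominator.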

So what we have seen so far is just the variance part of the Kalman Filter. To help us see the complete picture, here is the mean equation, or mu. Normally we talk about mu first and then the variance; we just tried to understand this concept from a different angle.

Figure 9

What the equation above says is that the estimated mean at time t given time t equals the estimated mean at time t forecasted at time t-1, just the moment before t happens, plus the forecast error at time t, represented by y(t) minus y-hat(t), weighted by the Kalman Gain Factor. (Maybe this part needs more explanation.) Basically, the equation shows how we arrive at the final number to report. It is based on four components:

1. u(t|t-1) ← mean calculated for time t given information at time t-1
2. K(t) ← the variance factor, also known as the Kalman Gain Factor
3. y(t) ← observed data point at time t
4. y^(t) ← forecast of y to expect at time t
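The four components combine in a single line of arithmetic. The sketch below assumes scalar quantities; the function name `corrected_mean` and the numbers are made up for illustration.

```python
def corrected_mean(mu_pred, gain, y_obs, y_forecast):
    """Mean update: prediction plus the gain-weighted forecast error.

    mu_pred:    u(t|t-1), mean for time t given information at t-1
    gain:       K(t), the Kalman Gain Factor
    y_obs:      y(t), the observation at time t
    y_forecast: y^(t), the forecast of the observation at time t
    """
    forecast_error = y_obs - y_forecast
    return mu_pred + gain * forecast_error

# Forecast was 100, we observed 108, and the gain says to apply 25% of the gap.
print(corrected_mean(mu_pred=100.0, gain=0.25, y_obs=108.0, y_forecast=100.0))  # 102.0
```

A gain of 0 would leave the model's prediction untouched, while a gain of 1 would move it by the full forecast error.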

Here is a graphical explanation of what these variables are:

Figure 10

At time zero, denoted by t0, we have u0. Then, using the dynamic system of equations, we can calculate what u will be at time one while we are still at time t0. The symbol for this is u1|0, and the time is denoted by t1. When t=1 arrives, we have the mean for time t=1 calculated when the actual time is t=1. Note that y is the observed data point that contributes to the calculation of the mean and the variance.

I intended to take a good amount of time to write about this favorite topic of mine. However, my time was committed to my Practicum team, working on a presentation for a conference about electric vehicles in Monterey. I hope to share this fascinating topic with you for inspirational purposes and, at the same time, mark my understanding of it at this very moment in time.

The goal is really to inspire, not merely to display expertise or to claim to be an expert on this topic.
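The timeline can be walked through in a short loop. This is a minimal sketch under stated assumptions: a scalar random-walk state that is observed directly, with hypothetical noise variances `process_var` and `meas_var`. It is not the marketing model from the lecture, and `run_filter` is an illustrative name.

```python
def run_filter(observations, mu0, var0, process_var, meas_var):
    """Scalar predict/correct loop for a random-walk state (an assumption)."""
    mu, var = mu0, var0
    history = []
    for y in observations:
        # Predict: u(t|t-1), computed before the observation for time t arrives.
        mu_pred = mu
        var_pred = var + process_var
        # Correct: u(t|t), computed once y(t) has been observed.
        gain = var_pred / (var_pred + meas_var)
        mu = mu_pred + gain * (y - mu_pred)
        var = (1.0 - gain) * var_pred
        history.append((mu_pred, mu, var))
    return history

# Made-up observations y(1), y(2), y(3), starting from a vague guess u0 = 0.
steps = run_filter([10.0, 12.0, 11.0], mu0=0.0, var0=100.0,
                   process_var=1.0, meas_var=4.0)
for t, (mu_pred, mu, var) in enumerate(steps, start=1):
    print(f"t={t}: u(t|t-1)={mu_pred:.2f}  u(t|t)={mu:.2f}  var(t|t)={var:.2f}")
```

In this random-walk setup, each step's prediction u(t|t-1) is simply the previous step's corrected mean, which mirrors how u1|0 is obtained from u0 in the figure.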

THANK YOU for reading my blog!

Bao Nguyen

MSBA at UC Davis, class of 2019

p.s.

I want to thank Professor Prasad Naik for sharing such a fascinating topic with us and for allowing the use of the invaluable excerpts shown in Figure 1 to Figure 10, along with this concept as I have tried to convey it in my very humble understanding. I only hope to mark my understanding as of today, July 10, 2019, and to assess my understanding of this concept and these materials again in the near future.
