Part 2 — The Math Behind Optical Flow

Part 2 of GPS Denied Navigation

By Arush

Published in Software for Autonomous Aerospace, May 17, 2020

This is Part 2 in a series on building software for GPS Denied Navigation in advanced aerospace robotics.

Part 1 — Visual Feature Detection for Autonomous Vehicle Video Streams
Part 2 — The Math Behind Optical Flow
Part 3 — Lucas-Kanade Optical Flow

In Part 1 of my series on GPS Denied Navigation I showed some simple code for detecting features in video using OpenCV.

Now that we can detect features across an image sequence, we need to know how to use them.

Optical flow is the apparent motion of objects, surfaces, or edges in an image sequence, caused by relative motion between the camera and the scene. It can be described precisely with a little math. Let’s first take a look at how we track a single pixel:

The diagram below shows a pixel that has moved from (x, y) at time t to (x+u, y+v) at time t+1.

Pixel Correspondence Problem

From this picture, it’s easy to figure out the velocity vector (u,v). But when we look at two real images, we’d first need to solve what’s called the pixel correspondence problem. That is, we need to know which pixels in image 2 correspond to which pixels in image 1.

To solve this problem we make two assumptions.

Assumption 1: the motion is small. This means we can look in the vicinity of where the pixel was to determine where it is now.

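Assumption 2: the appearance doesn’t change from t to t+1. This assumption is best expressed mathematically: the intensity of the pixel at its old location equals the intensity at its new location one frame later.

$$I(x, y, t) = I(x+u, y+v, t+1)$$

This relationship is known as the brightness constancy constraint.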
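To make the two assumptions concrete, here is a minimal sketch that uses them directly: it searches a small neighborhood (Assumption 1) for the displacement whose patch looks the same (Assumption 2). The synthetic frames, the helper names `make_frame` and `match_pixel`, and the sum-of-squared-differences score are my own illustrative choices, not code from this series.

```python
import numpy as np

def make_frame(shift_x=0, shift_y=0, size=32):
    """Synthetic grayscale frame: a smooth blob whose center is offset by the shift."""
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = size / 2 + shift_x, size / 2 + shift_y
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * 3.0 ** 2))

def match_pixel(frame1, frame2, x, y, radius=3, half=2):
    """Return the (u, v) within `radius` that minimizes the sum of squared
    differences between the patch around (x, y) in frame1 and the patch
    around (x+u, y+v) in frame2."""
    patch1 = frame1[y - half:y + half + 1, x - half:x + half + 1]
    best_score, best_uv = np.inf, (0, 0)
    for v in range(-radius, radius + 1):          # small motion: local search only
        for u in range(-radius, radius + 1):
            patch2 = frame2[y + v - half:y + v + half + 1,
                            x + u - half:x + u + half + 1]
            score = np.sum((patch1 - patch2) ** 2)  # brightness constancy: want ~0
            if score < best_score:
                best_score, best_uv = score, (u, v)
    return best_uv

frame1 = make_frame()                      # frame at time t
frame2 = make_frame(shift_x=2, shift_y=1)  # same scene, moved by (2, 1) at time t+1
u, v = match_pixel(frame1, frame2, x=14, y=15)
print("estimated displacement:", (u, v))
```

This brute-force search only works because the motion is small enough to fall inside the search window, and it compares a patch rather than a single pixel because one intensity value on its own is rarely unique.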

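If we take a first-order Taylor series expansion of the right-hand side of the brightness constancy constraint about (x, y, t), we find the following (the time step is 1, so the time-derivative term has no multiplier):

$$I(x+u, y+v, t+1) \approx I(x, y, t) + \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t}$$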

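We can plug this back into the brightness constancy constraint. The I(x, y, t) terms cancel, and reorganizing what remains gives the following:

$$\frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t} = 0$$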

It is common to use a subscript to denote derivatives, so this is how it looks with simplified notation:

$$I_x u + I_y v + I_t = 0$$

This equation says that any change in the appearance of a pixel over time has to be explained by spatial motion induced by the camera movement.
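As a quick sanity check, the sketch below tests the constraint numerically in its usual form, I_x u + I_y v + I_t = 0, where I_x and I_y are the spatial image gradients and I_t is the frame difference. The synthetic Gaussian image, the sub-pixel motion values, and the use of `np.gradient` for the derivatives are assumptions I picked for illustration.

```python
import numpy as np

def intensity(xs, ys, cx, cy, sigma=6.0):
    """Smooth synthetic image: a Gaussian blob centered at (cx, cy)."""
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

size = 64
ys, xs = np.mgrid[0:size, 0:size].astype(float)
u, v = 0.3, -0.2  # true motion in pixels per frame (hypothetical values)

frame1 = intensity(xs, ys, 32.0, 32.0)          # I(x, y, t)
frame2 = intensity(xs, ys, 32.0 + u, 32.0 + v)  # I(x, y, t+1): blob moved by (u, v)

# np.gradient returns the derivative along axis 0 (rows, y) first, then axis 1 (columns, x).
I_y, I_x = np.gradient(frame1)
I_t = frame2 - frame1  # temporal derivative with a time step of 1

residual = I_x * u + I_y * v + I_t
max_residual = np.abs(residual[1:-1, 1:-1]).max()  # skip the one-sided border estimates
max_change = np.abs(I_t).max()

print("largest |I_t|:     ", max_change)
print("largest |residual|:", max_residual)
```

The residual is not exactly zero because the Taylor expansion drops second-order terms, but for small motion on a smooth image it should be much smaller than the frame-to-frame change itself.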

Note: I(x,y,t) gives the intensity of the pixel at location (x,y) at time t. If this is a gray-scale image, this intensity is a single number corresponding to the brightness of the pixel. If it is a color image, it might be a 3-vector giving the amount of red, green, and blue in that pixel.
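In code, I(x, y, t) is just an array lookup. Here is a small sketch, assuming frames are stored as NumPy arrays in a Python list indexed by time; the helper name `I` and the tiny hand-made frames are hypothetical. Note that image arrays are indexed row first, so the lookup is `[y, x]`, not `[x, y]`.

```python
import numpy as np

def I(frames, x, y, t):
    """Intensity at pixel (x, y) in frame t. Arrays are indexed [row, column] = [y, x]."""
    return frames[t][y, x]

# Gray-scale: one number per pixel (0 = black, 255 = white for 8-bit images).
gray = np.zeros((4, 6), dtype=np.uint8)
gray[1, 2] = 200
gray_frames = [gray]

# Color: a 3-vector per pixel, stored here in red, green, blue order.
color = np.zeros((4, 6, 3), dtype=np.uint8)
color[1, 2] = (255, 128, 0)
color_frames = [color]

print(I(gray_frames, x=2, y=1, t=0))
print(I(color_frames, x=2, y=1, t=0))
```

The derivation above treats the intensity as a single number, so color frames are typically converted to gray-scale before the constraint is applied.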

In Part 3 I’ll explain the difficulty of solving this equation and what we can do to estimate a solution.
