Motion Detection Techniques (With OpenCV Code)

Safa Abbes
4 min read · Jun 23, 2022


Over the past few years, motion detection has become one of the most important research areas in computer vision. Many approaches have been developed for video sequences, some better than others. In this article we'll explain and implement in OpenCV some of the basic approaches. The original code of the following snippets was developed by Dr. Nicola Garau from the University of Trento, and the presented information was acquired during the Computer Vision course taught by Professor Nicola Conci at the University of Trento.

The ‘Red Light, Green Light’ game from Netflix’s TV show “Squid Game”, which is based on motion detection

1. Frame Differencing

The idea behind frame differencing is pretty simple. We check the difference between two video frames pixel by pixel. If there is movement, there will be a change in the pixel values, and so we will have our motion map. Simple, right? However, some pixel value changes can occur due to noise (a change in illumination, for example). To avoid capturing noise in our motion mask, we apply a threshold that highlights big changes in terms of intensity and discards the small ones. Note that there is no single right choice for the threshold value; it is usually set empirically.

Now that we understand the concept, let's look at some code:

Frame Differencing Code
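
Since the original embedded snippet isn't reproduced here, below is a minimal sketch of frame differencing in OpenCV; the video path "video.mp4" and the threshold value are placeholder assumptions:

```python
import cv2

# Placeholder input: any video file path or camera index works here
cap = cv2.VideoCapture("video.mp4")

ret, prev_frame = cap.read()
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)

THRESHOLD = 30  # chosen empirically, as noted above

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Pixel-wise absolute difference between consecutive frames
    diff = cv2.absdiff(gray, prev_gray)

    # Keep only large intensity changes; small (noisy) ones are discarded
    _, motion_mask = cv2.threshold(diff, THRESHOLD, 255, cv2.THRESH_BINARY)

    cv2.imshow("Motion mask", motion_mask)
    prev_gray = gray
    if cv2.waitKey(30) & 0xFF == 27:  # press Esc to stop
        break

cap.release()
cv2.destroyAllWindows()
```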

This method is computationally cheap; however, it suffers from two main drawbacks: foreground aperture and ghosting, both caused by the frame rate and the object's speed. One solution, developed by Kameda and Minoh, is the double difference: we compute a thresholded difference between the frames at time t and t−1, and between t−1 and t−2, then combine them with a logical AND to ensure that we always detect the object itself and not its ghost (a sketch is shown below). Another issue with frame differencing is that once the object stops moving, it is no longer detected. Whether this matters depends on the task, but what if we want to keep detecting the moving object even when it stops for a while? One answer to this issue is the background subtraction technique.
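
For completeness, here is a minimal sketch of the double difference described above, assuming three consecutive grayscale frames and an empirically chosen threshold (the function name and signature are illustrative, not from the original code):

```python
import cv2

def double_difference(f0, f1, f2, threshold=30):
    """Double difference over three consecutive grayscale frames:
    f0 at time t-2, f1 at t-1, f2 at t."""
    _, d1 = cv2.threshold(cv2.absdiff(f1, f0), threshold, 255, cv2.THRESH_BINARY)
    _, d2 = cv2.threshold(cv2.absdiff(f2, f1), threshold, 255, cv2.THRESH_BINARY)
    # The logical AND keeps only pixels that changed in both differences,
    # suppressing the ghost left at the object's previous position
    return cv2.bitwise_and(d1, d2)
```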

2. Background Subtraction

Background subtraction is a widely used approach to detect moving objects in a sequence of frames from a static camera. It requires a reference image to serve as the background, generally acquired without any objects in the scene. We then compute the difference between the current frame and the background frame (the reference image). The main task is to detect the foreground, which usually represents the moving objects.

Background Subtraction Code
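
Again, as a stand-in for the embedded snippet, here is a minimal sketch of background subtraction; it assumes the first frame of the video is a clean shot of the background:

```python
import cv2

cap = cv2.VideoCapture("video.mp4")  # placeholder input

# Assumption: the first frame contains only the background (no moving objects)
ret, reference = cap.read()
background = cv2.cvtColor(reference, cv2.COLOR_BGR2GRAY)

THRESHOLD = 30  # chosen empirically

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Difference between the current frame and the fixed reference background
    diff = cv2.absdiff(gray, background)
    _, foreground_mask = cv2.threshold(diff, THRESHOLD, 255, cv2.THRESH_BINARY)

    cv2.imshow("Foreground mask", foreground_mask)
    if cv2.waitKey(30) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()
```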

This method achieves good results when the color of the objects differs from the background frame. However, like frame differencing, it has some major drawbacks. Besides being highly sensitive to illumination changes and camera motion, it also suffers from the so-called "waking person" problem: if a background object moves (an object that belongs to the reference image), both the real object and its ghost are detected. In this case, we face the opposite of the problem met in frame differencing: what if we want to stop detecting a foreground object and absorb it into the background?

3. Adaptive Background Subtraction

This approach basically combines the two previous techniques to get the best of both by introducing a learning rate λ. At each timestep, we weight the contribution of the incoming image and the previous background to construct a new background: background = λ × current_frame + (1 − λ) × background. For example, with λ = 0.1 it takes roughly 10 frames for a stationary foreground object to be absorbed into the background, while with λ = 0.5 the update is faster (it takes only about 2 frames). Note that there is no rule for choosing λ; it is set empirically, since it depends on the task and the environment we're dealing with.

Adaptive Background Subtraction Code
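
Here is a minimal sketch of the adaptive update, using OpenCV's accumulateWeighted to blend each new frame into the running background; the λ and threshold values are placeholders:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("video.mp4")  # placeholder input

ret, first = cap.read()
# The running background must be float so it can accumulate fractional updates
background = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY).astype(np.float32)

LAMBDA = 0.1    # learning rate, chosen empirically
THRESHOLD = 30

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Foreground mask against the current background estimate
    diff = cv2.absdiff(gray, background.astype(np.uint8))
    _, foreground_mask = cv2.threshold(diff, THRESHOLD, 255, cv2.THRESH_BINARY)

    # background = LAMBDA * gray + (1 - LAMBDA) * background
    cv2.accumulateWeighted(gray, background, LAMBDA)

    cv2.imshow("Foreground mask", foreground_mask)
    if cv2.waitKey(30) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()
```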

4. Mixture of Gaussians (MoG)

Mixture of Gaussians is a widely used background-modeling approach for detecting moving objects from static cameras. Briefly explained, this approach models each pixel as a weighted sum of Gaussians, where the weight defines the contribution of each Gaussian. The intuition behind having multiple Gaussians instead of one is that a single pixel can represent several objects (snowflakes and the building behind them, for example). By computing the color histogram over previous frames, we can get an idea of which Gaussians correspond to background or foreground objects. For instance, a Gaussian with large evidence (weight) and a small standard deviation implies that the object it describes appears frequently and doesn't change between frames, so it is probably part of the background. That is how the algorithm works: each incoming pixel is checked against the available models. In case of a match, we update the weight, mean, and standard deviation of that model, and if the weight divided by the standard deviation is large, we classify the pixel as background; otherwise, as foreground.

Mixture of Gaussians Code
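
The original snippet isn't shown here; as a stand-in, this sketch uses OpenCV's built-in MOG2 background subtractor (Zivkovic's variant of the Mixture-of-Gaussians approach described above), with illustrative parameter values:

```python
import cv2

cap = cv2.VideoCapture("video.mp4")  # placeholder input

# MOG2 maintains a mixture of Gaussians per pixel; history and varThreshold
# are tunable, and the values below are only illustrative
mog = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=16,
                                         detectShadows=False)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Each pixel is matched against its Gaussian models and classified:
    # 255 = foreground, 0 = background
    foreground_mask = mog.apply(frame)

    cv2.imshow("MoG foreground mask", foreground_mask)
    if cv2.waitKey(30) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()
```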

You can find the complete code on my GitHub repository.
