Motion Detection Basics

I am in my final year of engineering, and for our major project my friends and I are building a self-driving car that uses various machine learning and computer vision techniques to stay on track. This post is based on what I have found so far while detecting obstacles and objects near the car.


To detect anything, we first need a video feed. A video feed cannot be processed directly, though: it has to be split into individual frames (images), and each frame is then processed. The OpenCV library provides the complex underlying algorithms needed for image processing, and it supports Python. To grab a video feed using OpenCV:

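import cv2

# Grab video from the default camera (device index 0)
camera = cv2.VideoCapture(0)
# Read frames one at a time
while True:
    # read() returns a tuple: a success flag and the frame itself
    (grabbed, frame) = camera.read()
    if not grabbed:
        break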

Now that we have a video feed, it's time to make the computer figure out what is inside the image. There are many ways to do this, but the way a computer analyses an image or a video is always the same: it converts the image or frame into an array of numbers and then processes that array, as shown in the image on the left. This array is known as an image feature vector. A feature here means the characteristic of the image that we turn into numbers; it may be colour, shape or even texture.
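To make the "array of numbers" idea concrete, here is a minimal sketch using NumPy (which is what OpenCV's Python bindings use to represent frames). The tiny 2 x 2 frame and the helper name mean_colour_feature are made up for illustration; mean colour is just one very simple choice of feature.

```python
import numpy as np

# A tiny synthetic 2x2 "frame" in BGR channel order (the order OpenCV uses).
# Every pixel is three numbers between 0 and 255.
frame = np.array([
    [[255, 0, 0], [255, 0, 0]],
    [[0, 255, 0], [0, 0, 0]],
], dtype=np.uint8)

def mean_colour_feature(image):
    """Hypothetical helper: a 3-element feature vector holding the
    average blue, green and red value over all pixels."""
    return image.reshape(-1, 3).mean(axis=0)

print(frame.shape)                 # (rows, columns, channels)
print(mean_colour_feature(frame))  # one number per colour channel
```

A real frame from the camera has the same layout, only with many more rows and columns.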

To detect motion at the most basic level, we can do one simple thing: compare the initial scene with every later scene. This can be used for surveillance in places like warehouses to detect whether there is any activity. Here we assume that the initial frame has no motion. It's important to note that even consecutive frames of a static scene are never exactly the same (for example because of sensor noise), and this can throw off the comparison. To tackle this problem we apply a Gaussian blur to every frame, which smooths each pixel over a small square neighbourhood of about 20 x 20 px (OpenCV needs odd kernel dimensions, so 21 x 21 in practice). We also convert each frame to grayscale, as colour adds little when detecting motion.

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # grayscale
gray = cv2.GaussianBlur(gray, (21, 21), 0)  # kernel width and height must be odd
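To see why the blur matters, here is a rough NumPy-only sketch. It is not OpenCV's implementation: gaussian_blur is a hand-written stand-in, the sigma of 3.5 is an assumed value for a 21 px kernel, and the two "frames" are the same flat grey scene with random noise added. It counts how many pixels differ by more than 25 with and without blurring.

```python
import numpy as np

def gaussian_kernel_1d(ksize, sigma):
    """Normalised 1-D Gaussian weights of length ksize."""
    x = np.arange(ksize) - (ksize - 1) / 2.0
    weights = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum()

def gaussian_blur(image, ksize=21, sigma=3.5):
    """Illustrative separable Gaussian blur with reflected borders
    (a stand-in for cv2.GaussianBlur, not its actual code)."""
    kernel = gaussian_kernel_1d(ksize, sigma)
    pad = ksize // 2
    padded = np.pad(image.astype(np.float64), pad, mode="reflect")
    # Blur along rows first, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

# Two noisy captures of the same static scene (nothing is moving).
rng = np.random.default_rng(0)
scene = np.full((64, 64), 120.0)
frame_a = scene + rng.integers(-15, 16, size=scene.shape)
frame_b = scene + rng.integers(-15, 16, size=scene.shape)

raw_delta = np.abs(frame_a - frame_b)
blurred_delta = np.abs(gaussian_blur(frame_a) - gaussian_blur(frame_b))

print("pixels over threshold without blur:", int((raw_delta > 25).sum()))
print("pixels over threshold with blur:   ", int((blurred_delta > 25).sum()))
```

Without the blur, pure noise is enough to push some pixels over the threshold; after blurring, the noise averages out and the static scene stays quiet.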

Let's start comparing the initial frame with the later frames now. OpenCV provides absdiff() to do this. We'll also threshold frameDelta to reveal only the regions of the image that have significant changes in pixel intensity. If the delta is 25 or less, we discard the pixel and set it to black (i.e. background). If the delta is greater than 25, we set it to white (i.e. foreground). The function used is cv2.threshold. The first argument is the source image, which should be a grayscale image. The second argument is the threshold value used to classify the pixel values. The third argument is maxVal, the value assigned when a pixel is more than (or, for some styles, less than) the threshold. OpenCV provides different styles of thresholding, selected by the fourth parameter of the function. The different types are cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC, cv2.THRESH_TOZERO and cv2.THRESH_TOZERO_INV. We use cv2.THRESH_BINARY here:

frameDelta = cv2.absdiff(firstFrame, gray)
thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]
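If it helps to see what these two calls do to the numbers, here is a small NumPy sketch that mimics them on a one-row "image". The helpers abs_diff and threshold_binary are my own illustrative re-implementations, not OpenCV functions, and the pixel values are made up.

```python
import numpy as np

def abs_diff(a, b):
    """Per-pixel |a - b|. Widen to a signed type first, because
    subtracting uint8 arrays directly would wrap around below zero."""
    return np.abs(a.astype(np.int16) - b.astype(np.int16)).astype(np.uint8)

def threshold_binary(delta, thresh=25, max_val=255):
    """Pixels strictly above thresh become max_val, everything else 0."""
    return np.where(delta > thresh, max_val, 0).astype(np.uint8)

first = np.array([[100, 100, 100, 100]], dtype=np.uint8)
current = np.array([[100, 110, 125, 200]], dtype=np.uint8)

delta = abs_diff(first, current)  # [[0, 10, 25, 100]]
mask = threshold_binary(delta)    # [[0, 0, 0, 255]]
print(delta)
print(mask)
```

Note that a delta of exactly 25 still counts as background; only the last pixel, which changed by 100, ends up white.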

Done! At the end, after all operations, we need to release the camera with camera.release().

Since this post is based on my very first attempt at motion detection, don't expect it to be accurate, or even close to accurate. Even changes in lighting conditions can show up as false-positive motion detections. A more robust approach is needed for detecting the background and separating objects from it.

I'll try to cover such improvements to this algorithm in a future post. I hope you've enjoyed reading! Let me know your thoughts in the comments.
