Day 55 of 100DaysofML

Charan Soneji · Published in 100DaysofMLcode · 6 min read · Aug 13, 2020

Background Removal Project using OpenCV. This is a simple yet cool feature that is used very commonly in video streaming services such as Zoom, Hangouts, etc. The underlying idea is very simple, and I would like to show how easily it can be done.

So what exactly does this project do? We have our background and we have our foreground. This project is going to help us remove the background and replace it with any video or picture, just as video chatting/streaming apps do.

Since I have already discussed the basics of OpenCV and its usage before, I am going to dive right into the code and its explanation.

Essentially, what we are going to do is use a BACKGROUND SUBTRACTION technique. The idea is to subtract a reference shot of the background (the scene behind me) from the live frame, so that whatever differs, namely the foreground in front of me, can be separated out. This is the entire idea of the project, and it will start making more sense once we implement it. So, let’s get straight to the implementation.

The first step is going to be to import all the essential libraries needed for this, and I would recommend using pip to install any of the libraries that you do not have installed.

#Importing libraries
import cv2
import numpy as np
import sys

The next step is going to be to switch on my camera and open the video that I am going to be using as my background. You can download any video and save it in the same folder as your notebook, or just mention the path to the video file.

video = cv2.VideoCapture(1)
oceanVideo = cv2.VideoCapture("ocean.mp4")
success, ref_img = video.read()
flag = 0
  • The first line turns on your camera.
  • The second line opens the video file using OpenCV.
  • The third line lets the camera capture one frame of the background and save it as a reference. This is called ref_img.
  • The flag is used to manually control when the background gets replaced with the image or video, which you will understand in the coming lines.

The next important step is to run a while loop, effectively an infinite loop that exits only when the quit key ('q' in the code further down) is pressed, inside which we read a frame from the video camera. The next step is to read a frame from the second video inside the same loop.

success, img = video.read()
success2, bg = oceanVideo.read()
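One small gotcha worth guarding against here (my addition, not in the original snippets): once the background clip runs out of frames, read returns success2 = False and bg = None, which would crash the resize step below. A minimal fix that loops the clip, using OpenCV's frame-position property:

#Rewind the background video once it ends (inside the while loop)
if not success2:
    oceanVideo.set(cv2.CAP_PROP_POS_FRAMES, 0)
    success2, bg = oceanVideo.read()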

Now, we need to ensure that the background frame and the camera frame are of the same size, so we resize the video frame to match the reference image taken at the start. For this we use a small function which I picked up online, but it is very easy to understand; it is called resize.

#Resize function
def resize(dst, img):
    width = img.shape[1]
    height = img.shape[0]
    dim = (width, height)
    resized = cv2.resize(dst, dim, interpolation=cv2.INTER_AREA)
    return resized

#Use inside while loop to resize video to reference background
bg = resize(bg, ref_img)

Conceptually, it is very important to understand that all images are stored as matrices, and these matrices can be added to or subtracted from each other only if they have the same dimensions. This is why it is important to get them both to the same dimensions.
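As a quick illustration (the frame sizes here are made up for the example), mismatched shapes make OpenCV's arithmetic fail outright:

#Shape mismatch demo
a = np.zeros((480, 640, 3), dtype=np.uint8)   # e.g. a webcam frame
b = np.zeros((720, 1280, 3), dtype=np.uint8)  # e.g. a raw video frame
try:
    cv2.subtract(a, b)                        # different sizes, so this fails
except cv2.error:
    print("shapes must match:", a.shape, "vs", b.shape)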

One of the most important steps is carried out next, whereby we create a mask.

We do this by subtracting the stored reference image from the new frame coming out of the webcam, and we do the subtraction in both directions using the two lines below. Both directions are needed because cv2.subtract saturates: any pixel where the result would be negative is clipped to zero, so each direction alone would miss half of the changes.

#create a mask
diff1=cv2.subtract(img,ref_img)
diff2=cv2.subtract(ref_img,img)
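To see why both directions are needed, here is a tiny demonstration of cv2.subtract's saturating behaviour on single-pixel images:

a = np.array([[40]], dtype=np.uint8)
b = np.array([[10]], dtype=np.uint8)
print(cv2.subtract(a, b))   # [[30]]
print(cv2.subtract(b, a))   # [[0]] - negative results are clipped to zero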

The next important step is to add these two differences together and then apply a threshold. There is enough noise that it would be very easy to lose the details otherwise, so we take the difference image we have obtained and zero out every value below a threshold. This threshold value will differ from person to person, since it depends on the amount of noise your camera produces and on the clarity of the picture. Thus, I would recommend playing around with this value to understand how it behaves.

diff = diff1+diff2
diff[abs(diff)<13.0]=0

The next step is to convert the difference image into grayscale using the inbuilt OpenCV function cvtColor. We then set any gray values below a small cutoff to zero. The effect is that wherever there is a difference between the frames, the pixel ends up bright (towards white), and wherever there is no difference, it stays black.

gray = cv2.cvtColor(diff.astype(np.uint8), cv2.COLOR_BGR2GRAY)
gray[np.abs(gray) < 10] = 0

At this stage we are done with the creation of our mask, and we convert it into a clean black-and-white (binary) image using the following lines of code (inside the main while loop). Every nonzero value is set to exactly 255, which matters because the inversion in the next step flips each pixel bitwise: 255 flips to 0, whereas an in-between value would flip to another nonzero value and break the mask.

fgmask = gray.astype(np.uint8)
fgmask[fgmask>0]=255
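If the mask looks speckled at this point, two optional tweaks (my suggestions, not part of the original code) can help: preview the mask while tuning the earlier threshold values, and apply a median blur to remove salt-and-pepper noise.

#Optional: inspect and clean up the mask
cv2.imshow('mask', fgmask)          # preview while tuning the thresholds
fgmask = cv2.medianBlur(fgmask, 5)  # smooth out isolated speckles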

The next step is to invert the mask. You may ask why we are doing this.
The mask we just built selects the foreground; its inverse selects the background. Using that background mask, we are going to pull pixels from the replacement image/video stream, while the original mask pulls the foreground out of the camera feed. The inversion is done with OpenCV's bitwise_not operator.

#invert the mask
fgmask_inv = cv2.bitwise_not(fgmask)
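On a 0/255 binary mask, bitwise_not simply swaps the two values, so foreground and background trade places:

m = np.array([[0, 255]], dtype=np.uint8)
print(cv2.bitwise_not(m))   # [[255   0]]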

NOTE: IN CASE THE FOREGROUND/BACKGROUND PARTS GOT A BIT CONFUSING, just be patient and read through the entire code at the end of the blog and it should make more sense.

We are now going to use the masks to extract the relevant parts of our foreground and background. Using the lines of code given below, the foreground mask keeps only you from the camera frame, while the inverted mask keeps only the scenery from the replacement video.

#use the masks to extract the relevant parts from FG and BG
fgimg = cv2.bitwise_and(img,img,mask = fgmask)
bgimg = cv2.bitwise_and(bg,bg,mask = fgmask_inv)
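The mask argument is what does the selecting here: a pixel survives the bitwise_and only where the mask is nonzero, and is zeroed everywhere else. A tiny example on a made-up 1x2 image:

px = np.array([[[10, 20, 30], [40, 50, 60]]], dtype=np.uint8)  # two pixels
m = np.array([[255, 0]], dtype=np.uint8)                       # keep, drop
print(cv2.bitwise_and(px, px, mask=m))                         # second pixel becomes [0 0 0]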

Next is a simple addition of these extracted parts. Since each image is zero exactly where the other has content, a pixel-wise add stitches them together without any overlap.

#combine both the BG and the FG images
dst = cv2.add(bgimg,fgimg)

The last and final step of our main while loop is to display the image using cv2.imshow. Just copy the following piece of code at the end of the loop as it is; note that the final two lines (destroyAllWindows and release) belong after the loop, to clean up once it exits.

cv2.imshow('Background Removal', dst)
key = cv2.waitKey(5) & 0xFF
if ord('q') == key:
    break
elif ord('d') == key:
    flag = 1
    print("Background Captured")
elif ord('r') == key:
    flag = 0
    print("Ready to Capture new Background")

cv2.destroyAllWindows()
video.release()
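Since the snippets above live at different nesting levels, here is a minimal end-to-end sketch of how they might fit together. Note the assumptions I'm making that the fragments above don't show: the background clip is rewound when it ends, pressing 'd' re-captures the reference frame, and flag gates whether the composited frame or the raw camera feed is displayed. For the exact version, see the GitHub link below.

#Minimal end-to-end sketch (assumptions noted in comments)
import cv2
import numpy as np

def resize(dst, img):
    height, width = img.shape[:2]
    return cv2.resize(dst, (width, height), interpolation=cv2.INTER_AREA)

video = cv2.VideoCapture(1)               # camera (try 0 if 1 doesn't open)
oceanVideo = cv2.VideoCapture("ocean.mp4")
success, ref_img = video.read()           # initial reference background
flag = 0

while True:
    success, img = video.read()
    success2, bg = oceanVideo.read()
    if not success:
        break
    if not success2:                      # assumption: loop the clip
        oceanVideo.set(cv2.CAP_PROP_POS_FRAMES, 0)
        success2, bg = oceanVideo.read()
    bg = resize(bg, ref_img)

    #create the mask from both subtraction directions
    diff1 = cv2.subtract(img, ref_img)
    diff2 = cv2.subtract(ref_img, img)
    diff = diff1 + diff2
    diff[abs(diff) < 13.0] = 0
    gray = cv2.cvtColor(diff.astype(np.uint8), cv2.COLOR_BGR2GRAY)
    gray[np.abs(gray) < 10] = 0
    fgmask = gray.astype(np.uint8)
    fgmask[fgmask > 0] = 255
    fgmask_inv = cv2.bitwise_not(fgmask)

    #extract FG from the camera and BG from the replacement video
    fgimg = cv2.bitwise_and(img, img, mask=fgmask)
    bgimg = cv2.bitwise_and(bg, bg, mask=fgmask_inv)
    dst = cv2.add(bgimg, fgimg)

    #assumption: show the composite only after 'd' has been pressed
    cv2.imshow('Background Removal', dst if flag == 1 else img)
    key = cv2.waitKey(5) & 0xFF
    if ord('q') == key:
        break
    elif ord('d') == key:
        flag = 1
        ref_img = img                     # assumption: re-capture reference
        print("Background Captured")
    elif ord('r') == key:
        flag = 0
        print("Ready to Capture new Background")

cv2.destroyAllWindows()
video.release()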

Given below are a few screenshots of the implementation of the above algorithm.

Demo

If you look closely at the screenshots, there's a huge void where my head should be; that's because my head was identified as part of the background. I have attached a link to my GitHub below for the source code of this project.

That’s it for today. Thanks for reading.

Cheers.
