Harry’s Invisibility Cloak using Python (OpenCV)

9 min read · Jul 24, 2020

A few days ago I was making an animation for teaching complex analysis. In that animation I used OpenCV to transform any image or live video according to complex analytic functions or matrices. To do so, I first created an array of pixels and then moved them around. The pixels that got lost inside certain ranges were then replaced using a mask.

Inversion applied to my beautiful face using OpenCV

This “mask” made me think: can we mask a portion of a live video and then replace it with a previously captured frame? Something like what Mr. India did, or what Harry Potter did?

Obviously, many people have already done it, and as I am a lazy bum, here I will just explain the code that is already available and make it a little smaller and more compact.

The power of invisibility

With all this talk out of the way, let’s begin with our main goal.

Requirements:

You will need some Python libraries for this.

numpy (this is the saviour of physics and maths programmers).
time (we don’t actually need this one; it is just used to give our machine a little rest).
OpenCV (obviously, from the title you should have expected this 🙄).

Installation instructions for those libraries are linked; just click on them. Apart from those libraries, the most important things are a Homo sapiens, a laptop, and Internet (so you can upload the result).

Writing and understanding the code:

You can use any text editor like Sublime Text, or something like Visual Studio, Jupyter Notebook, PyCharm, etc.

First we include all our necessary libraries:

import numpy as np
import cv2
import time

Now after this, we begin the main programming by starting to record frames.

cap = cv2.VideoCapture(0)
fourcc = cv2.VideoWriter_fourcc(*'XVID')
output = cv2.VideoWriter('Invisible_man.avi', fourcc, 30, (640, 480))

In this block what we have done is:

First we created a capture using the webcam; the live stream is read through this object. The 0 (zero, as in UltraMan) represents our primary webcam. If you have additional ones, replace the zero with 1 (or a higher index) to use those.
Then, to save your craziness, we will use VideoWriter. cv2.VideoWriter_fourcc builds the codec code, and here the XVID codec is used.
To specify the video format and name you want, we again use VideoWriter (this one is a class of cv2). The first argument is the name of the output video file. The second one takes the codec (defined on the previous line). Then comes the frame rate, and finally the width and height of the video. Here is a catch: the size has to match the frames your webcam actually delivers, so a higher width and height will typically not work. The output will be saved in the folder the program is run from (usually the same location as the program file).

time.sleep(3)
still_frame = 0
for i in range(60):
    ret, still_frame = cap.read()
    still_frame = np.flip(still_frame, axis=1)

Now, here comes our only use of the time library.

time.sleep(3) lets our webcam wait a little (3 sec) before working. This is done for many reasons. One of them is that the webcam doesn’t work well right after opening; it needs some time to adjust to the light.
After this, we define a variable named still_frame and assign it a placeholder value.
Now we load the image of everything in front of the camera (just our background). This will be used as our reference image. cap.read() does this job for us. It returns a boolean, True or False, depending on whether a frame was read successfully (i.e., whether our camera is on or off). It also returns an image, which is saved in the variable still_frame.
Finally, the last line flips the image about the vertical axis (a left-right mirror), as the saved still frame was reversed by the camera’s recording.
Steps 3 and 4 are written inside the for loop, which runs 60 times, because we want to capture the background, i.e., our surroundings, as clearly as possible. Each pass overwrites still_frame, so only the last frame is kept; the extra reads simply give the camera more time to settle. You can load still_frame just a single time (but I hate the word single!!), but it will make our result a little worse. So I suggest running it at least 50 times.
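The loop above keeps only the last frame it reads. If you want the 60 reads to actually improve the background, one possible variation (not what the original program does) is to keep all the frames and take their per-pixel median, which suppresses random sensor noise. Here is a minimal sketch with synthetic frames; median_background is a hypothetical helper name.

```python
import numpy as np

def median_background(frames):
    """Per-pixel median of a list of equally sized uint8 frames."""
    stack = np.stack(frames, axis=0)              # shape: (n_frames, H, W, 3)
    return np.median(stack, axis=0).astype(np.uint8)

# Synthetic demo: five flat gray frames, one of them with a noisy pixel.
frames = [np.full((4, 4, 3), 100, dtype=np.uint8) for _ in range(5)]
frames[2][1, 1] = 255                             # a one-off glitch
background = median_background(frames)

# With a real webcam the list could be filled like this (not run here):
# frames = [np.flip(cap.read()[1], axis=1) for _ in range(60)]
```

The glitch appears in only one of the five frames, so the median ignores it and the background stays flat.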

Idea behind doing these steps

Suppose your room is like this. Until now, what we have done is scan this image and store it inside the variable still_frame.

This is our background, i.e., a still frame (remember we have read the image 60 times to make it better).

So, after running the program (a few seconds later), I come into the picture. That means our current frame has changed.

I come inside the frame wearing a blue T-shirt. This blue shirt is our Invisibility Cloak. So, to create the illusion of being invisible, we can just remove the blue pixels from the new image (the current frame) and replace them with the corresponding pixels of the image saved in the variable still_frame. This is why I first created a variable for storing our background data. I hope all the previous steps are clear.

while True:
    ret, img = cap.read()
    if ret == True:
        img = np.flip(img, axis=1)
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

Now, let’s see what we have done in this block.

Here the line while True: keeps the commands inside it running continuously. It is done as we want to be live (Live capturing|onak dabi).
Now as I have discussed above, we need a current frame, where you will be standing with a sheet of a specific colour (I will use blue). So, to capture live (get a frame at each instant) we use cap.read(). Remember cap is already defined above.
In the next step, we first see if the camera is on. It is checked by the if ret == True: command, as ret is only True if a frame was actually read. So everything under the if command will only run if a new frame is captured.
In the last 2 lines I have just flipped it as before and converted the image to the HSV (Hue-Saturation-Value) colour format.

Why HSV?

You may ask: why HSV? Why not RGB? The answer is simple. Machines are not that great at picking out a colour in RGB; it needs more complex programs to do that. There, how a colour looks depends on all three channels together, so RGB is very sensitive to lighting, and it is hard to separate the same colour at low and high brightness.

But in HSV only Hue (just a single parameter) defines the colour. For this reason it is used, as a single variable is much easier to work with.

In the HSV colour model, hue ranges from 0 to 360. But OpenCV stores each channel in 8 bits (2⁸ values, from 0 to 255), so it halves the hue and stores it in the range 0 to 179. For this reason, the colour ranges in OpenCV and in the original colour wheel are different.

For our purpose, I will choose blue. To find the hue of blue, first notice where blue is on the original wheel. It is in the range 180 to 240. So in OpenCV the range of blue is roughly 90 to 120 (I will take the upper limit as 130 for a better result). If you want to use red, it is in the range 0 to 10 and also 170 to 180. This is the case because red sits on both sides of zero. So we have 2 different ranges.

Saturation in HSV ranges from 0 to 100. It describes the amount of gray in a particular colour. In OpenCV the maximum saturation is 255, and 0 means gray. For low values of saturation, many colours (like red and pink) can’t be separated by the computer, so we will use a range of 50 to 255 for our program. Similarly, Value (which represents brightness) will be taken in the range of 50 to 255.
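If you pick your colour from a regular colour picker (hue in degrees, saturation and value in percent), the conversion to OpenCV’s 8-bit scale is simple arithmetic. The helper below is a hypothetical convenience function, not part of the original program, and it assumes the usual 0 to 360 / 0 to 100 / 0 to 100 convention for its inputs.

```python
def to_opencv_hsv(hue_deg, sat_pct, val_pct):
    """Convert (0-360, 0-100, 0-100) HSV to OpenCV's 8-bit (0-179, 0-255, 0-255)."""
    h = int(round(hue_deg / 2)) % 180        # OpenCV stores hue / 2
    s = int(round(sat_pct * 255 / 100))      # percent -> 0..255
    v = int(round(val_pct * 255 / 100))
    return h, s, v

print(to_opencv_hsv(240, 100, 100))  # pure blue -> (120, 255, 255)
print(to_opencv_hsv(180, 20, 20))    # -> (90, 51, 51)
```

So the 180 to 240 slice of the colour wheel lands on 90 to 120 in OpenCV, and a 20% threshold on saturation or value corresponds to about 50 on the 0 to 255 scale.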

lower_blue = np.array([90, 50, 50])
upper_blue = np.array([130, 255,255])
mask1 = cv2.inRange(hsv, lower_blue, upper_blue)

In this block of code, the first two lines represent the blue colour range using 2 arrays. The 1st value is HUE, the 2nd is SATURATION and the last is VALUE.

cv2.inRange is used to create a mask. It takes the HSV image as an argument (i.e., the current frame we just converted), finds all parts of the frame whose colour lies in the given blue range, and separates out the area covered by the blue sheet (it will treat any blue colour equally). Here only one range is used. But when you are using a red sheet, you have to define another mask (as red lies in 2 different ranges) and then combine them using OR logic, i.e., mask1 = mask1 + mask2, where mask2 is the mask for the other range.

As you can see, if we apply this to our new frame, it will look like this. See, our machine is only selecting the blue shirt (ignore the noise, i.e., those white dots).

All other colours are removed. Now we just have to remove the selected pixels (the white ones) from our current image and replace them with our background frame’s pixels.

Before this, we have to apply a morphological transformation to mask1. This is done to remove noise, i.e., those white dots. You can even run the program without it, but the result will be pretty bad.

mask1 = cv2.morphologyEx(mask1, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8),iterations=2)
mask1 = cv2.morphologyEx(mask1, cv2.MORPH_DILATE, np.ones((3, 3), np.uint8),iterations=2)
mask2 = cv2.bitwise_not(mask1)
result1 = cv2.bitwise_and(img, img, mask=mask2)
result2 = cv2.bitwise_and(still_frame, still_frame, mask=mask1)
finalOutput = cv2.addWeighted(result1, 1, result2, 1, 0)

1. In this block, the first 2 lines remove noise using morphological operations with a 3×3 matrix of ones: opening wipes out the small white dots, and dilation then grows the mask a little so that it covers the edges of the cloak. This makes the mask smoother.

2. In the next line, we use the bitwise_not operation. Its main purpose is to create a mask of all the remaining things, which are not selected by mask1. As you can see in the image, here everything else is selected and only my T-shirt is not.

3. Now we use bitwise_and to take the region common to our new frame and mask2, i.e., everything except our blue cover (my T-shirt), and then the region common to mask1 and our still_frame, i.e., the pixels where my T-shirt is (but taken from the image of the background).

4. Finally we add those up using addWeighted.
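Steps 2 to 4 boil down to "take the background where the mask is white, and the live frame everywhere else". As a cross-check, the same result can be written in plain NumPy. This is only an equivalent sketch, assuming the mask holds 0 or 255 as produced by cv2.inRange; composite is a hypothetical helper name and the tiny images are synthetic.

```python
import numpy as np

def composite(frame, background, mask):
    """Use background pixels where mask > 0, and frame pixels elsewhere."""
    cloak = mask[..., None] > 0          # shape (H, W, 1), broadcasts over BGR
    return np.where(cloak, background, frame)

# Synthetic 2x3 images: dark live frame, bright stored background.
frame = np.full((2, 3, 3), 10, dtype=np.uint8)
background = np.full((2, 3, 3), 200, dtype=np.uint8)

mask = np.zeros((2, 3), dtype=np.uint8)
mask[:, 0] = 255                         # the cloak covers the left column

final = composite(frame, background, mask)
print(final[:, :, 0].tolist())
```

Only the left column is taken from the background; the rest of the live frame is untouched, which is exactly the invisibility effect.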

And this does the trick. Here is my output. Due to poor lighting and a low-quality webcam, the output is not too great. You can change the blue range according to the colour you use, for example lower_blue = np.array([100, 60, 60]) and upper_blue = np.array([140, 255, 255]).

The whole code is given below, with an output video from one of my friends who has used it. Like her, if you use the right background and light, you can even beat Harry Potter’s original stunt.