Real Time Object Detection using YOLOv3 with OpenCV and Python

Darshan Adakane
Published in Analytics Vidhya
4 min read · Nov 8, 2019
Real-time object detection: umbrella, person, car and motorbike detected using YOLOv3

In the previous article we saw object detection with the YOLOv3 algorithm on a single image. In this article, let's go further and see how we can use YOLOv3 for real-time object detection.

We can solve this problem in two ways: on the CPU or on the GPU.

The CPU route has the advantage that nothing extra needs to be installed; we can use OpenCV right away. The disadvantage is that it is slow (how slow depends on the CPU you are running). It is the recommended route for beginners.

The GPU route, on the other hand, is faster because the graphics processor does the heavy computation. The downside is that we need to compile several libraries manually and configure many things before we can start using it for our problem.

Let's keep this tutorial on the CPU. In the last tutorial we worked with a single image; now we will use a series of images (i.e. video) in OpenCV as input.

First, a recap of the code that runs before any image is loaded. We import the cv2 and numpy libraries. Then we load the Darknet architecture into net, and in classes we store all the object names from the coco.names file. Finally, we get the output layers from net so that we can read the detections from the final layers.
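The setup described above could look roughly like the sketch below. The file names yolov3.cfg and yolov3.weights are assumptions (the usual Darknet names), and getUnconnectedOutLayersNames() assumes a reasonably recent OpenCV build; the helper names load_class_names and load_network are made up for this illustration.

```python
from pathlib import Path


def load_class_names(names_path):
    """Return one class label per non-empty line of a coco.names-style file."""
    text = Path(names_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_network(cfg_path="yolov3.cfg", weights_path="yolov3.weights"):
    """Load the Darknet model and return (net, output_layer_names).

    The default file names are assumptions; point them at your own files.
    """
    import cv2  # imported here so load_class_names works without OpenCV

    net = cv2.dnn.readNet(weights_path, cfg_path)
    # Names of the final (unconnected) YOLO layers whose outputs we read.
    output_layers = net.getUnconnectedOutLayersNames()
    return net, output_layers
```

With the files in the working directory, classes = load_class_names("coco.names") followed by net, output_layers = load_network() gives the three objects the rest of the tutorial relies on.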

Now we capture video from the webcam using cv2.VideoCapture(0); the 0 selects the first webcam. To measure how many frames we process per second, we also import the time library. We use frame_id to count frames, and we get the elapsed time by subtracting the starting time from the current time (time.time()). We set the font to HERSHEY_PLAIN.

Then we read every frame of the video in a while loop using cap.read() and apply to each frame the same steps we applied to the image in the last article. We store each frame's dimensions in height, width, channels. To process frames faster, we reduce the blob size from 416x416 to 320x320; accuracy drops a little with this change. On the other hand, if we increase it to 608x608, detection will be more accurate but very slow.
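One rough way to see why the blob size matters: YOLOv3 predicts boxes on three grids (strides 32, 16 and 8) with three anchor boxes per grid cell, so the number of candidate boxes grows with the square of the input size. The sketch below only counts those boxes; it is a proxy for the workload, not a timing measurement, and the function name is made up for this illustration.

```python
def yolov3_box_count(input_size, strides=(32, 16, 8), anchors_per_cell=3):
    """Number of candidate boxes YOLOv3 predicts for a square input."""
    if input_size % 32 != 0:
        raise ValueError("YOLOv3 input size must be a multiple of 32")
    return sum(anchors_per_cell * (input_size // s) ** 2 for s in strides)


# The size is what you pass as the third argument of cv2.dnn.blobFromImage,
# e.g. (320, 320) instead of (416, 416).
for size in (320, 416, 608):
    print(size, yolov3_box_count(size))
```

This prints 6300, 10647 and 22743 boxes for 320, 416 and 608, so 608x608 has well over three times as many candidates to evaluate as 320x320.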

In the code, fps is frames per second, which we calculate as the number of frames divided by the elapsed time. We draw this value as text on the displayed frame. At the end we check for key == 27 (the Esc key on the keyboard), which breaks the loop and stops execution when that key is pressed.
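The capture loop, FPS calculation and Esc check described above can be sketched as follows. This is a minimal outline under assumptions: the function names, the window name and the text position and colour are illustrative choices, and the detection step itself is left as a placeholder comment.

```python
import time

ESC_KEY = 27  # key code that cv2.waitKey returns for Esc


def compute_fps(frame_count, elapsed_seconds):
    """Average frames per second since the loop started."""
    if elapsed_seconds <= 0:
        return 0.0
    return frame_count / elapsed_seconds


def run_webcam_loop(camera_index=0, window_name="Image"):
    """Show webcam frames with an FPS overlay until Esc is pressed."""
    import cv2  # imported here so compute_fps can be used without OpenCV

    cap = cv2.VideoCapture(camera_index)  # 0 selects the first webcam
    font = cv2.FONT_HERSHEY_PLAIN
    starting_time = time.time()
    frame_id = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # camera unavailable or stream ended
                break
            frame_id += 1

            # ... run YOLOv3 on `frame` and draw the boxes here ...

            fps = compute_fps(frame_id, time.time() - starting_time)
            cv2.putText(frame, f"FPS: {fps:.2f}", (10, 50), font, 2, (0, 0, 0), 2)
            cv2.imshow(window_name, frame)
            if cv2.waitKey(1) & 0xFF == ESC_KEY:
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Because the frame count is divided by the total elapsed time, the number shown is an average since start-up rather than an instantaneous rate.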

One last thing I would like to share is about confidence. Remember that we used a threshold of 0.5: if we increase it, the detections we keep are more reliable but we get fewer objects, and vice versa. We also show this confidence value on each box.
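To make the effect of the threshold concrete, here is a small sketch of the filtering step. It assumes each detection row has the usual YOLOv3 layout (centre x, centre y, width, height, objectness, then one score per class, with box values normalised to 0..1); the function name and the return format are made up for this illustration.

```python
def filter_detections(detections, frame_width, frame_height, threshold=0.5):
    """Keep detections whose best class score exceeds `threshold`.

    Rows may be lists or NumPy arrays laid out as
    [cx, cy, w, h, objectness, score_0, score_1, ...].
    """
    kept = []
    for row in detections:
        scores = list(row[5:])
        class_id = max(range(len(scores)), key=scores.__getitem__)
        confidence = float(scores[class_id])
        if confidence <= threshold:
            continue  # too uncertain, drop it
        # Scale the normalised box back to pixel coordinates.
        w = int(row[2] * frame_width)
        h = int(row[3] * frame_height)
        x = int(row[0] * frame_width - w / 2)   # top-left corner
        y = int(row[1] * frame_height - h / 2)
        kept.append((class_id, confidence, (x, y, w, h)))
    return kept
```

Raising threshold shrinks the returned list to the detections the network is most sure about; lowering it lets more, but less certain, boxes through.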

The test video from the webcam detected the following objects: umbrella, car, person and motorbike. Check out the output file by clicking here (file name is Webcam_ObjectDetection.MOV).

To speed up detection, apart from reducing the blob size, there is another way: Tiny YOLO. For this we pass a different weights and cfg file, which has fewer convolutional layers than the original yolov3 cfg file. To use it, just replace the weights and cfg file names in the code with the tiny versions; the rest of the code remains the same.
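The swap might look like the sketch below. The tiny file names are the ones listed in this article; the full-model names are assumed to be the standard Darknet ones, and MODEL_FILES and load_yolo are illustrative names rather than part of OpenCV.

```python
# Weights/cfg pairs per variant. The "yolov3" entry uses assumed file names.
MODEL_FILES = {
    "yolov3": ("yolov3.weights", "yolov3.cfg"),
    "yolov3-tiny": ("yolov3-tiny.weights", "yolov3-tiny.cfg"),
}


def load_yolo(variant="yolov3-tiny"):
    """Load the chosen YOLOv3 variant with OpenCV's dnn module."""
    import cv2  # requires the opencv-python package

    weights_path, cfg_path = MODEL_FILES[variant]
    return cv2.dnn.readNet(weights_path, cfg_path)
```

Only the two file names differ between the variants, so switching back to the full model is a one-argument change: load_yolo("yolov3").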

You can find those files here:

yolov3-tiny.weights: Download from here

yolov3-tiny.cfg: Download from here

So if you are working on a low-end CPU, Tiny YOLO is the best option. But nowadays, with computing power increasing in the latest laptops and computers (some are GPU-enabled too), I would recommend using regular YOLOv3 directly.

I hope this helps in implementing the YOLOv3 real-time object detection algorithm. I have tried to make it understandable from a beginner's point of view. Please let me know if you have any questions or comments.

Convolution Rocks!

Happy Learning!

[You can find complete code on Github. Star if you like it. Thanks]
