Getting Started on Object Detection with OpenCV
With current technology trends, computer vision has become an important part of the technology field, opening the door to limitless innovations. Think about computer vision from this perspective: just as our eyes are an important part of the human body, embedding vision into computers/machines allows them to see.
NB: There are many frameworks and tools you can use for computer vision, like IBM Visual Recognition, Microsoft Azure, OpenCV and many more…
In this write-up, we are going to learn how to detect objects using OpenCV in Python on a streaming video.
NB: You can also use a still image, which we will cover in another write-up. For now we will identify objects, specifically the face, in a video stream. LIVE!!
Requirements:
- Linux OS
- Python3
- pip3
- opencv-python
- Webcam or Camera
- An IDE or editor for writing the code
INSTALLING OpenCV
To install OpenCV for Python on Linux, open up your terminal and type the command below:
NB: I am working from my Desktop directory
pip3 install opencv-python
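If you want to confirm that the install actually worked, you can print the package version straight from the terminal (just a quick sanity check, nothing more):

python3 -c "import cv2; print(cv2.__version__)"

If it prints a version number, you are good to go.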
Nice, see? It's that easy to get started with the Python OpenCV package. Now let's get coding!!
CODING BEGINS
Open up your favourite editor and paste the following code, then I will explain every line.
import cv2

# Face Haar cascade file:
# https://github.com/Itseez/opencv/blob/master/data/haarcascades/haarcascade_frontalface_default.xml
face_cascade = cv2.CascadeClassifier('<dir to the face.xml file>/face.xml')

# Capture the video stream from the default webcam
cap = cv2.VideoCapture(0)

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        break

    # Our operations on the frame come here: convert it to grayscale for the detector
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Multi-scale face detector
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)

    # Loop through the detected faces and place a box around each one
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
        font = cv2.FONT_HERSHEY_SIMPLEX
        cv2.putText(frame, 'FACE', (x, y - 10), font, 0.5, (11, 255, 255), 2, cv2.LINE_AA)
        roi_gray = gray[y:y + h, x:x + w]

    # Display the resulting frame
    cv2.imshow('face detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
Let's discuss the code…
First, we import the OpenCV library on the first line.
In OpenCV, an object is detected using a Haar Cascade, which is stored in an xml file. Let me explain. If you have used IBM Visual Recognition before, you understand the annotation process, where we have multiple images that we want to train on and create a model from, right?
In IBM Visual Recognition, the labelling process prepares the data/images by drawing boxes around the specific objects and labelling them, and the resulting .xml file from that process is what is used to identify the objects when testing and using the model, because it contains specific details like width, height, maybe colour and all those elements. Hope we are still together…hehehe
In OpenCV the .xml file is generated by training a Haar Cascade classifier, so in our case the face detection .xml is loaded into the variable called face_cascade.
There are a couple of already generated xml files in the OpenCV GitHub repository (the haarcascades folder linked in the code comment above). Get the xml file that you want, copy its contents into a file and save it.
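As a side note, recent opencv-python releases also bundle the pre-generated cascade files, so instead of copy pasting the xml you can load it straight from cv2.data.haarcascades. Below is a small sketch of that, assuming you installed opencv-python with pip, plus a check that the classifier actually loaded (OpenCV does not complain about a wrong path, it just detects nothing):

import cv2

# Load the frontal face cascade that ships with the opencv-python package
cascade_path = cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
face_cascade = cv2.CascadeClassifier(cascade_path)

# A wrong path gives an empty classifier instead of an error, so check for it
if face_cascade.empty():
    print('Could not load the cascade file, check the path!')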
Then we create another variable called cap, which will hold the streaming video captured on our webcam. The cv2.VideoCapture(0) call comes with OpenCV, and the 0 passed to it indicates that we are going to use the default webcam on our computer, so if you are using an external camera change that to 1.
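One small habit I recommend (not required, just my own addition): check that the camera actually opened before looping, otherwise cap.read() will quietly return empty frames:

import cv2

cap = cv2.VideoCapture(0)  # 0 = default webcam, 1 = first external camera

# isOpened() tells us whether OpenCV managed to grab the device
if not cap.isOpened():
    print('Cannot open the camera, maybe another app is using it?')
else:
    ret, frame = cap.read()  # ret is False when no frame was grabbed
    print('Got a frame:', ret)
    cap.release()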
In the while loop, what we are saying is: while video is being captured, store each frame in a variable called frame, then convert it into a grayscale frame (you can also use it as it is, if you don't need a gray frame). Then we use the detectMultiScale() method to detect the face, using the Haar Cascade we loaded from face.xml, and it returns the coordinates of each face it finds.
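In case the 1.3 and 5 look cryptic: 1.3 is the scaleFactor and 5 is minNeighbors. Here is the same call written with keyword arguments, plus a minSize I added so tiny false detections get ignored (the exact values are just ones that tend to work, tune them for your own camera):

# The same detector call, with the parameters named explicitly
faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=1.3,   # how much the image is shrunk at each detection scale
    minNeighbors=5,    # how many overlapping detections are needed to keep a face
    minSize=(30, 30)   # ignore detections smaller than 30x30 pixels
)
print('Faces found:', len(faces))  # each entry is an (x, y, w, h) box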
In the for loop, we go through all the detected faces, draw a rectangle around each one using the .rectangle() method, and add a label with the .putText() method, even setting a font for the label!
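By the way, the roi_gray line crops out just the face region; we don't use it further here, but if you ever want to keep the detected faces you could save each crop to disk. A rough sketch you would drop inside the while loop (the filename pattern is just my own choice, and every new frame will overwrite the files):

# Save each detected face crop from the colour frame
for i, (x, y, w, h) in enumerate(faces):
    roi_color = frame[y:y + h, x:x + w]              # the face region in colour
    cv2.imwrite('face_{}.png'.format(i), roi_color)  # hypothetical filename pattern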
In OpenCV we show the result using the .imshow() method, then we assign the letter q to quit or close the window whenever we want.
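If the waitKey(1) & 0xFF == ord('q') line looks strange: waitKey(1) waits one millisecond for a key press and returns its code, and the & 0xFF mask keeps only the lowest byte so the comparison with ord('q') behaves the same on every platform. You could even handle more keys inside the loop, for example an s key to save a snapshot (my own addition, not part of the original code):

key = cv2.waitKey(1) & 0xFF  # poll the keyboard for 1 ms
if key == ord('q'):          # q quits the loop
    break
elif key == ord('s'):        # s saves the current frame to disk
    cv2.imwrite('snapshot.png', frame)  # hypothetical filename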
Finally we release the webcam.
DEMO
To run the file with Python, open the terminal and type:
python3 <name of the file>.py
I changed the label FACE to my name for fun…hehehe
That's it! Hope everything was clear, and if not, don't hesitate to ask in the comment section and leave a clap for more to come; this is only the beginning.
Thank you. You are free to join us in creating amazing stuff, just email us at devligenceltd@gmail.com
Author: Collins .H. Munene
CEO. Devligence Ltd
Youtube: Artificial Intelligence projects