Facial Detection pt.1

Harsh Kumar Khatri

Facial detection is an industry that has been booming for the past few years, and a lot of work is being done in it. We use it for many things in our day-to-day life, for example when the camera identifies a face and unlocks the screen of our phone. Many companies use facial detection as a mode of attendance in their offices, and many schools internationally use it to take the attendance of students. In this article I will show you how to build and train your own facial detection system, identify images, and label them according to your needs.

Note: This article will have multiple parts, covering the training and the testing part separately.

Let’s dive into it.

First we will be installing the dependencies we need for training our model on the images we have stored.

pip install opencv-contrib-python
pip install numpy
pip install Pillow

Note that PIL is installed through the Pillow package, and pickle ships with Python's standard library, so it does not need to be installed separately.

Once we have the required dependencies installed, we will now import them in our main Python file.

import cv2
import os
import numpy as np
from PIL import Image
import pickle

The next, and one of the most essential, things is setting the correct path for the images we have stored. I have stored them inside the images folder in the root directory of this project.

BASE_DIR = os.path.dirname(os.path.abspath(__file__))

This stores the root path of the project directory.

Next we will be storing the directory of the images in the image_dir variable

image_dir = os.path.join(BASE_DIR, "images")

It will store the absolute path of the images folder in the specified variable.

Next we will be initializing the cascade classifier from the XML file, which contains the pre-trained data. It helps us identify the faces inside the images we have stored in the folder.

face_cascade = cv2.CascadeClassifier('cascades/data/haarcascade_frontalface_alt2.xml')

I have stored the pre-trained data inside the cascades/data/ directory.

Next we will be initializing our recognizer, which will help us in recognizing the faces. This recognizer is present within the face module of cv2 (provided by opencv-contrib-python).

recognizer = cv2.face.LBPHFaceRecognizer_create()

Next we will initialize some variables: one integer, one dictionary, and two lists.

current_id = 0
label_ids = {}
y_labels = []
x_train = []

Next we will be having a loop with the help of which we will extract the root location, the directories, and the files which are present. os.walk will walk through the files present inside the absolute path which we have stored above.

for root, dirs, files in os.walk(image_dir):
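As a quick standalone sketch of what os.walk yields, here is a tiny example; the directory layout below is hypothetical and built in a temporary folder just for illustration:

```python
import os
import tempfile

# Build a tiny hypothetical layout: <tmp>/images/alice/1.png
base = tempfile.mkdtemp()
person_dir = os.path.join(base, "images", "alice")
os.makedirs(person_dir)
open(os.path.join(person_dir, "1.png"), "w").close()

# os.walk visits every directory under images/ and lists its files
found = []
for root, dirs, files in os.walk(os.path.join(base, "images")):
    for f in files:
        found.append(os.path.join(root, f))
```

Each iteration gives the current directory (root), its subdirectories (dirs), and its files (files), which is exactly what our training loop relies on.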

Next we will be taking out each file from the files which we have obtained above.

for file in files:

Next we will check whether the images we have are of the correct format.

if file.endswith("png") or file.endswith("jpg") or file.endswith("jpeg"):

We will allow only 'png', 'jpg', and 'jpeg' files to be used; all other files will be ignored.
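As a side note, str.endswith also accepts a tuple of suffixes, so the same filter can be written more compactly; the file names below are just examples:

```python
valid_ext = ("png", "jpg", "jpeg")
files = ["a.png", "notes.txt", "b.jpeg"]  # example file names

# endswith with a tuple checks all suffixes in one call
images = [f for f in files if f.endswith(valid_ext)]
```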

Now to get the individual path of each image we will be joining the file with the root directory.

path = os.path.join(root, file)

We have used the path join feature from the os module.

Next we will set the label to the name of the folder in which the images are present. In my case each folder is named after the person whose images are stored inside it. You can do the same; it makes it handy to deal with the images and the folders.
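In code this corresponds to the label line from the full listing at the end: the folder name is lowercased and spaces are replaced with dashes. The folder name below is a made-up example:

```python
import os

# The label is the lowercased folder name with spaces replaced by dashes
root = os.path.join("images", "Harsh Khatri")  # hypothetical person folder
label = os.path.basename(root).replace(" ", "-").lower()
```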

Now we will store the label inside the dictionary we created above, adding an integer value corresponding to it. This is done because the recognizer needs numeric labels rather than strings: different people can have names in different forms and languages, so we map each label to an integer instead. We use current_id to represent that integer value, and it changes for images from different folders.

if not label in label_ids:
    label_ids[label] = current_id
    current_id += 1

The above code will add the label to the dictionary if it is not already present in it.
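To see the mapping in action, here is a tiny standalone run of the same logic; the labels below are hypothetical folder names:

```python
label_ids = {}
current_id = 0

# Each new label gets the next free integer; repeats are skipped
for label in ["alice", "bob", "alice"]:
    if label not in label_ids:
        label_ids[label] = current_id
        current_id += 1
```

After the loop, "alice" maps to 0 and "bob" maps to 1, and the repeated "alice" does not create a new id.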

Now we will store the integer id for the current label in the variable id_, which we will use later during detection, where the integer values are mapped back to the names of the persons.

id_ = label_ids[label]

Now we will open the image in grayscale mode from the path which we have. We will do this with the Image module, which is present inside the PIL library. We have also specified the size which we will use further.

pil_image = Image.open(path).convert("L")
size = (550, 550)

Now we resize the image with the resize method, using an anti-aliasing filter to smooth the result.

final_image = pil_image.resize(size, Image.ANTIALIAS)
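One caveat worth knowing: Image.ANTIALIAS was removed in newer Pillow releases, while Image.LANCZOS refers to the same high-quality filter and is available in old versions too. A small defensive sketch (the image here is a dummy created in memory):

```python
from PIL import Image

# Prefer LANCZOS (the modern name for ANTIALIAS); fall back to BICUBIC
resample = getattr(Image, "LANCZOS", Image.BICUBIC)
img = Image.new("L", (300, 200))  # dummy grayscale image
final = img.resize((550, 550), resample)
```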

Further we will be converting the final image after resizing into a numpy array.

image_array = np.array(final_image, "uint8")

This is one of the most important steps, in which we will be detecting the faces in the images.

faces = face_cascade.detectMultiScale(image_array, scaleFactor=1.5, minNeighbors=5)

We have passed the array which we created above. scaleFactor specifies how much the image size is reduced at each image scale while searching for faces, and minNeighbors specifies how many neighboring detections a candidate rectangle needs in order to be kept as a face.

Now we will identify the region of interest (ROI) from the faces which we have detected.

for (x, y, w, h) in faces:
    roi = image_array[y:y+h, x:x+w]

We have the coordinates x, y and the width and height w, h, which are used to identify the ROI.

Finally we will append them to the lists. The first list, x_train, will hold the ROI, and the second list, y_labels, will hold the label corresponding to that ROI.
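These appends appear inside the face loop in the full listing at the end. As a standalone sketch (the image and detection below are dummy values, not real cascade output):

```python
import numpy as np

x_train, y_labels = [], []
image_array = np.zeros((100, 100), dtype="uint8")  # dummy grayscale image
id_ = 0
faces = [(10, 20, 30, 40)]  # one hypothetical detection: (x, y, w, h)

for (x, y, w, h) in faces:
    roi = image_array[y:y+h, x:x+w]  # crop the face region
    x_train.append(roi)              # training sample
    y_labels.append(id_)             # matching integer label
```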

Next we will store the label dictionary in a pickle file by opening the file in 'wb' mode, which means write binary.

with open("face-labels.pickle", 'wb') as f:
    pickle.dump(label_ids, f)

Now we will train the recognizer on the data stored in the x_train list with the labels in y_labels, and finally save the trained model in a .yml file.

recognizer.train(x_train, np.array(y_labels))
recognizer.save("face-trainner.yml")

With this we have completed the training part of this project and stored the files with their respective extensions.

The complete code I have used in this project is given below for your reference.

import cv2
import os
import numpy as np
from PIL import Image
import pickle

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
image_dir = os.path.join(BASE_DIR, "images")

face_cascade = cv2.CascadeClassifier('cascades/data/haarcascade_frontalface_alt2.xml')
recognizer = cv2.face.LBPHFaceRecognizer_create()

current_id = 0
label_ids = {}
y_labels = []
x_train = []

for root, dirs, files in os.walk(image_dir):
    for file in files:
        if file.endswith("png") or file.endswith("jpg") or file.endswith("jpeg"):
            path = os.path.join(root, file)
            label = os.path.basename(root).replace(" ", "-").lower()
            #print(label, path)
            if not label in label_ids:
                label_ids[label] = current_id
                current_id += 1
            id_ = label_ids[label]
            #print(label_ids)

            # verify this image, turn into a NUMPY array, GRAY
            pil_image = Image.open(path).convert("L")  # grayscale
            size = (550, 550)
            final_image = pil_image.resize(size, Image.ANTIALIAS)
            image_array = np.array(final_image, "uint8")
            #print(image_array)
            faces = face_cascade.detectMultiScale(image_array, scaleFactor=1.5, minNeighbors=5)

            for (x, y, w, h) in faces:
                roi = image_array[y:y+h, x:x+w]
                x_train.append(roi)
                y_labels.append(id_)

#print(y_labels)
#print(x_train)

with open("face-labels.pickle", 'wb') as f:
    pickle.dump(label_ids, f)

recognizer.train(x_train, np.array(y_labels))
recognizer.save("face-trainner.yml")

In the next article I will cover how to load the data from these files, recognize faces in live video, and mark them with their labels.

Also, the link to the project will be attached in the next article.

If you have any doubts, please comment them down below; I would be happy to help.

Originally published at https://harshblog.xyz on May 9, 2020.
