Methods for face detection and face recognition - A review.

A. Face Detection

lieng phy
Oct 14, 2018

1) Viola-Jones object detection framework (Haar cascades)

The framework was created by Paul Viola and Michael Jones in 2001. It can be used for a variety of object detection tasks, but primarily for face detection.

Viola-Jones requires full-view, frontal, upright faces. The algorithm has four stages:

  1. Haar Feature Selection
  2. Creating an Integral Image
  3. AdaBoost Training
  4. Cascading Classifiers

Human faces share some common properties, and Haar features match them: the location and size of the eyes, the mouth and the bridge of the nose, and the contrast in pixel intensity between adjacent regions (for example, the eye region is usually darker than the cheeks). The final detector is a cascade of 38 classifier layers that together use a total of 6061 features.
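To make the integral image and Haar feature stages more concrete, here is a minimal sketch in plain Python. The function names and the toy 4x4 patch are my own assumptions for illustration; this is not the code of any particular library. It shows why the integral image matters: once it is built, the sum of any rectangle costs only four lookups, so a Haar-like feature is a handful of additions.

```python
def integral_image(img):
    """Return a summed-area table padded with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: sum of the top half minus sum of the bottom half."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return top - bottom


# Hypothetical toy patch: a dark band (think "eyes") above a bright band ("cheeks").
patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)  # -1520: strongly negative, the top band is much darker
```

A real detector evaluates many such features at many positions and scales, which is why the constant-time rectangle sum is so important for speed.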

The algorithm has some downsides, but it is robust, has a high detection rate, and is fast enough for real-time applications. Some pretrained classifiers can be found here.
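Much of the speed comes from the cascade: a window that fails an early, cheap stage is rejected immediately, so most of the image never reaches the expensive later stages. The sketch below illustrates only that early-rejection idea. The stages, weights, thresholds and the dictionary-based "window" are hypothetical stand-ins that I made up; a real cascade uses AdaBoost-selected Haar features.

```python
def make_stage(features, threshold):
    """Build one stage: a weighted vote of weak classifiers against a threshold.

    `features` is a list of (weight, predicate) pairs. Both are hypothetical
    stand-ins for the features a trained detector would use.
    """
    def stage(window):
        score = sum(weight for weight, predicate in features if predicate(window))
        return score >= threshold
    return stage


def cascade_predict(stages, window):
    """Return (is_face, stages_evaluated), stopping at the first stage that rejects."""
    for i, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, i
    return True, len(stages)


# Stage 1 is cheap (one test); stage 2 is more expensive (two tests).
stages = [
    make_stage([(1.0, lambda w: w["eye_band"] < w["cheek_band"])], threshold=1.0),
    make_stage(
        [
            (0.6, lambda w: w["nose_bridge"] > w["eye_band"]),
            (0.4, lambda w: w["cheek_band"] > 100),
        ],
        threshold=0.8,
    ),
]

# Made-up average intensities for two candidate windows.
background = {"eye_band": 180, "cheek_band": 90, "nose_bridge": 100}
face_like = {"eye_band": 40, "cheek_band": 160, "nose_bridge": 120}

print(cascade_predict(stages, background))  # (False, 1): rejected by the first stage
print(cascade_predict(stages, face_like))   # (True, 2): passed every stage
```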

2) Histogram of Oriented Gradients (HOG)

HOG became widespread in 2005, when Navneet Dalal and Bill Triggs, researchers at the French National Institute for Research in Computer Science and Automation (INRIA), presented their work on it as a much more reliable solution. The algorithm is also a feature extractor for object detection. Instead of working with raw pixel intensities as the Viola-Jones method does, the technique counts occurrences of gradient orientations in localized portions of the image. The method uses overlapping local contrast normalization to improve accuracy.
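As a rough illustration of the idea, the sketch below builds the orientation histogram of a single cell and then normalizes two cells together as one block. It is simplified on purpose and rests on my own assumptions: 9 unsigned orientation bins over 0 to 180 degrees, central-difference gradients, hard binning without interpolation, and plain L2 normalization. A full HOG implementation is more careful in each of these steps.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram of one cell; each pixel votes with its gradient magnitude."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(vector, eps=1e-6):
    """Scale a block descriptor to (almost) unit length for contrast normalization."""
    norm = math.sqrt(sum(v * v for v in vector) + eps * eps)
    return [v / norm for v in vector]


# Hypothetical toy cells: one with a vertical edge, one with a horizontal edge.
vertical_edge = [[0, 0, 100, 100] for _ in range(4)]
horizontal_edge = [[0] * 4, [0] * 4, [100] * 4, [100] * 4]

vertical_hist = cell_histogram(vertical_edge)      # all votes land in bin 0 (about 0 degrees)
horizontal_hist = cell_histogram(horizontal_edge)  # all votes land in bin 4 (about 90 degrees)

# A "block" here is just two neighboring cells concatenated and normalized together.
block = l2_normalize(vertical_hist + horizontal_hist)
```

Because blocks overlap in the real method, each cell contributes to several normalized blocks, which is what makes the descriptor less sensitive to local lighting changes.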

3) Comparison

To observe the difference in face detection accuracy between the two feature descriptors, I use a test set collected from Google Images that includes 7 faces. Figure 1 shows the output for some test images.

Fig. 1. Output of HOG and Haar Cascade classifier

As you can see, HOG handles most of the difficult cases, such as people wearing glasses or faces turned slightly to the side. The HOG bounding boxes are also closer to the ground-truth face boxes.
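One common way to put a number on "close to the ground truth" is intersection over union (IoU). I did not compute it for the test set above, so the sketch below is only an illustration, and the boxes in it are made-up values in (left, top, right, bottom) format.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    inter = max(0, right - left) * max(0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes, not taken from the test images.
ground_truth = (10, 10, 110, 110)
tight_box = (20, 20, 110, 110)
loose_box = (0, 0, 150, 150)

print(round(iou(ground_truth, tight_box), 2))  # 0.81
print(round(iou(ground_truth, loose_box), 2))  # 0.44
```

An IoU of 1.0 means the detection matches the ground truth exactly, and 0.0 means the boxes do not overlap at all.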

Furthermore, I also tested the performance of the two methods on a Raspberry Pi 3. There is a huge difference in speed between the two algorithms, as shown in Table I.

The HOG method is precise and covers most cases, even when people wear glasses or turn their face to either side. It detects faces in most frames, but that precision comes at a cost: it slows the whole process down dramatically. If you have a powerful computer, HOG is not a bad choice because it has a low false positive rate. However, the Viola-Jones detector seems more practical on less powerful devices such as mobile phones or a Raspberry Pi.
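If you want to run a similar speed comparison on your own device, a small timing helper is enough. This is a minimal sketch, not the script I used: `measure_fps` is a hypothetical helper, and the two detectors are dummy stand-ins (one returns immediately, one sleeps) so the example runs without a camera or any library.

```python
import time


def measure_fps(detect, frames):
    """Run `detect` on every frame and return the average frames per second."""
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def fast_detector(frame):
    """Dummy stand-in for a cheap detector."""
    return []


def slow_detector(frame):
    """Dummy stand-in for an expensive detector: pretend each frame takes 2 ms."""
    time.sleep(0.002)
    return []


frames = list(range(5))  # placeholder "frames"; real code would pass images
fast_fps = measure_fps(fast_detector, frames)
slow_fps = measure_fps(slow_detector, frames)
print(f"fast: {fast_fps:.0f} fps, slow: {slow_fps:.0f} fps")
```

To compare real detectors, pass a function that wraps each detector call and use the same list of frames for both.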

One can also use a CNN as the face detector, but I do not take it into account in this review.

The script used for the comparison can be found here.

B. Face Recognition

1) Dlib

The Dlib library provides a pretrained model that is comparable to other state-of-the-art face recognition models, with an accuracy of 99.38% on the Labeled Faces in the Wild benchmark. The model is a ResNet network with 29 conv layers, trained on about 3 million faces.

Since the model uses deep learning to obtain face embeddings (face encodings), computing the encodings takes most of the processing time. Given its high accuracy, though, it is well worth considering.
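Once the encodings exist, recognition itself is cheap: two faces are compared by the Euclidean distance between their encodings. The sketch below shows only that matching step. The 0.6 threshold is the value I understand Dlib's documentation suggests for its model, so treat it as an assumption. The names and the 4-number vectors are made up to keep the example short (real Dlib encodings have 128 numbers), and `math.dist` needs Python 3.8 or newer.

```python
import math


def identify(known, probe, threshold=0.6):
    """Return (name, distance) of the closest known encoding.

    The name is None when even the closest encoding is farther than `threshold`.
    """
    best_name, best_distance = None, float("inf")
    for name, encoding in known.items():
        distance = math.dist(encoding, probe)
        if distance < best_distance:
            best_name, best_distance = name, distance
    if best_distance >= threshold:
        return None, best_distance
    return best_name, best_distance


# Hypothetical toy encodings; a real pipeline would get these from the network.
known = {
    "alice": [0.10, 0.20, 0.30, 0.40],
    "bob": [0.90, 0.80, 0.10, 0.00],
}
probe = [0.12, 0.18, 0.33, 0.41]      # very close to "alice"
stranger = [0.50, 0.50, 0.90, 0.90]   # far from everyone

print(identify(known, probe)[0])     # alice
print(identify(known, stranger)[0])  # None
```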

2) Local Binary Patterns Histograms (LBPH) Face Recognition

In computer vision, Local Binary Patterns (LBP) have been found to be a powerful feature for texture classification. They were first described in 1994 by T. Ojala et al., and the algorithm was applied to face recognition in 1996. The idea of the algorithm is to capture the local features of an image by comparing each pixel with its neighbors.

https://docs.opencv.org/2.4/_images/lbp.png

A 3x3 window is moved over the whole image, and at each position the LBP is defined by comparing the intensity of the center pixel with its eight surrounding neighbors. Neighbors with an intensity greater than the center pixel are marked with 1 and the others with 0; the resulting binary code is then converted to a decimal number. Once the local binary patterns have been computed, a local histogram is built from them. This local histogram is used for face recognition.
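The steps above can be sketched in a few lines of plain Python. A few things here are my own assumptions: the neighbors are read clockwise starting from the top-left, ties with the center count as 1, and two histograms are compared with a chi-square distance, which is one common choice rather than the only one. Real LBPH implementations also split the face into a grid and concatenate one histogram per grid cell, which I skip here.

```python
def lbp_code(img, y, x):
    """LBP code of pixel (y, x): one bit per neighbor, clockwise from the top-left."""
    center = img[y][x]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dy, dx in offsets:
        bit = 1 if img[y + dy][x + dx] >= center else 0  # assumption: ties count as 1
        code = (code << 1) | bit
    return code


def lbp_histogram(img):
    """256-bin histogram of the LBP codes of all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist


def chi_square(h1, h2):
    """Chi-square distance between two histograms; 0 means identical."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


# Hypothetical 3x3 window with center value 6.
window = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
# Clockwise bits: 5->0, 9->1, 1->0, 7->1, 8->1, 3->0, 2->0, 4->0 = 01011000
print(lbp_code(window, 1, 1))  # 88

hist = lbp_histogram(window)
print(chi_square(hist, hist))  # 0: a histogram is at distance 0 from itself
```

To recognize a face, you would compute its histogram and pick the known face whose histogram has the smallest distance.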

Despite its lower accuracy compared to Dlib, this method provides a much higher frame rate (frames per second). It suits applications that do not require a high level of security.

3) Comparison

A comparison demo of LBPH and Dlib using Ronaldo faces:

LBPH (left) and Dlib (right)

Source code for real-time face recognition with Dlib and LBPH. Generally, I prefer Dlib because of its high accuracy.

In this tutorial, we have learnt about some face detection and face recognition methods. Each has its advantages and disadvantages, so which one to use depends on your requirements. You can also combine face detection and face recognition methods depending on the accuracy, CPU power, and performance you need.

Hope you guys enjoy~
