Facial landmarks detection with dlib and haar cascade

Md Tarequl Islam
Published in Analytics Vidhya · Nov 30, 2020 · 4 min read

Photo by Chris Ried on Unsplash

Recently I was working on a project that involved facial landmark detection using Python. Python offers a library called dlib, which is very well suited for this job. To find any facial landmarks, one first has to extract the face from the image and then use that extracted ROI (region of interest) of the face to get the landmarks. Detecting a face in an image is quite an old trick in the computer vision field. The most common approach is a haar cascade classifier, which returns a multidimensional numpy array. The number of rows in that array is equal to the number of faces found in the image, and each row contains four integers: the first two are the coordinates of the top-left corner of the ROI, followed by its width and height.
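
As a minimal sketch of that detection step (assuming the pretrained haarcascade_frontalface_default.xml file sits next to the script and using the same Johnny Depp image as later in this post), it looks roughly like this:

# a minimal sketch of haar cascade face detection (file paths are assumptions)
import cv2

image = cv2.imread('johny.jpg')                        # any image containing faces
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)         # the classifier works on a grayscale image
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:                             # each row: top-left corner, width, height
    print(x, y, w, h)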

Dlib also provides a face detector through the function get_frontal_face_detector(). Calling this detector on an image returns an array of rectangle objects, where a rectangle object represents a rectangular area of the image. Each rectangle holds four values, so it also describes the ROI that contains the face, just in a different format.

So there are three main differences between the haar cascade classifier and the dlib detector. First, while the haar cascade detector returns a multidimensional numpy array, the dlib detector returns an array of rectangle objects. Secondly, the haar cascade detector returns the top-left corner with the width and height, while dlib returns the top-left corner (the first two values of the rectangle object) and the bottom-right corner (the last two values of the rectangle object). Thirdly, the dlib detector only takes the grayscaled image as a parameter, whereas the haar cascade classifier takes two more parameters, scaleFactor and minNeighbors. The scaleFactor parameter specifies how much the image size is reduced at each image scale, and the minNeighbors parameter specifies how many neighbors each candidate rectangle should have for it to be retained. With these two parameters you can control how strictly or loosely you want to detect faces in the image, so you have more flexibility with the haar cascade than with the dlib function. For example, when the value of minNeighbors is 5, the face of the guy behind Johnny Depp also gets detected, but that is not the case when minNeighbors is 10.

Face detection with haar cascade with minNeighbors = 5 (left) and minNeighbors = 10 (right)
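
A rough sketch of the dlib detector, just to illustrate the different output format (the image path is the same one used throughout this post):

# a sketch of face detection with dlib's frontal face detector
import cv2
import dlib

image = cv2.imread('johny.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

detector = dlib.get_frontal_face_detector()
rects = detector(gray)                          # an array of dlib.rectangle objects
for rect in rects:
    # top-left and bottom-right corners instead of width and height
    print(rect.left(), rect.top(), rect.right(), rect.bottom())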

Now I was curious to see whether the haar cascade and the dlib detector give the same result, i.e. the same ROI. But I observed a bit of a difference: the haar cascade extracts a larger area than the dlib function.

Haar cascade vs Dlib
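
The comparison above can be reproduced with a short sketch that draws both detections on the same image (green for the haar cascade, red for dlib; the colors are just my choice):

# a sketch that draws both detections on one image for comparison
import cv2
import dlib

image = cv2.imread('johny.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)   # haar cascade ROI in green

detector = dlib.get_frontal_face_detector()
for rect in detector(gray):
    cv2.rectangle(image, (rect.left(), rect.top()),
                  (rect.right(), rect.bottom()), (0, 0, 255), 2)   # dlib ROI in red

cv2.imshow('haar vs dlib', image)
cv2.waitKey(0)
cv2.destroyAllWindows()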

Now back to facial landmark detection. The dlib library provides a shape predictor, dlib.shape_predictor, which is loaded with a trained model file and then called with two arguments: the grayscaled version of the image and a dlib rectangle object holding the coordinates of the face area. Calling it returns the feature points, and those feature points are the facial landmarks. In my case, I want to detect the facial landmarks using the results of the haar cascade classifier. The problem is that the shape predictor expects a dlib rectangle object, not a numpy array. So to use the haar cascade results with the shape predictor, we have to convert the numpy array into a dlib rectangle object. That way we keep the flexibility of the haar cascade classifier.

Here is the code:

# importing libraries
import cv2
import numpy as np
import dlib

# function to convert a dlib.full_object_detection to a numpy array of (x, y) coordinates
def shape_to_np(shape, dtype="int"):
    coords = np.zeros((68, 2), dtype=dtype)
    for i in range(0, 68):
        coords[i] = (shape.part(i).x, shape.part(i).y)
    return coords

# reading an image and converting it to grayscale
image = cv2.imread('johny.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# loading the classifiers with their respective files
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

# detecting faces with the haar cascade classifier
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.10, minNeighbors=5)

# looping through each detected face, drawing a rectangle around the face
# and circles around the feature points
if len(faces) > 0:
    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 3)
        # creating the rectangle object from the outputs of the haar cascade classifier
        drect = dlib.rectangle(int(x), int(y), int(x + w), int(y + h))
        landmarks = predictor(gray, drect)
        points = shape_to_np(landmarks)
        for (px, py) in points:
            cv2.circle(image, (px, py), 2, (0, 255, 0), -1)

cv2.imshow('image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
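
For comparison, here is a rough sketch of the same landmark loop driven by dlib's get_frontal_face_detector() instead of the haar cascade, reusing the predictor and shape_to_np defined above; since the detector already returns dlib rectangle objects, no conversion step is needed.

# a sketch: the same landmark loop, but driven by dlib's own detector
image = cv2.imread('johny.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
detector = dlib.get_frontal_face_detector()
for rect in detector(gray):
    # no dlib.rectangle conversion needed here
    cv2.rectangle(image, (rect.left(), rect.top()),
                  (rect.right(), rect.bottom()), (0, 0, 255), 3)
    landmarks = predictor(gray, rect)
    points = shape_to_np(landmarks)
    for (px, py) in points:
        cv2.circle(image, (px, py), 2, (0, 0, 255), -1)
cv2.imshow('image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()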

You can also check my GitHub repo for detecting facial landmarks with dlib and find the necessary files there.

Here are some useful links if you want to know more about the dlib library and the haar cascade classifier:

  1. Object Detection: Face detection using haar cascade classifier
  2. Dlib documentation
  3. Facial landmarks with dlib, OpenCV and Python

