Eye Aspect Ratio (EAR) and Drowsiness Detector Using dlib
--
In this article, I will show you how to determine facial landmarks using the dlib library, how to calculate the Eye Aspect Ratio (EAR), and how to use the EAR to detect drowsiness.
Before you begin with the code part of this article, you need to install the dlib library for Python. Installing dlib has some prerequisites, and I recommend that you check this article.
What are the facial landmarks that dlib detects
The dlib library can be used to detect a face in an image and then find 68 facial landmarks on the detected face.
I will not go into the details of how it detects a face and locates the facial landmarks. The order of the detected landmarks is always the same, irrespective of image dimensions or face size: landmarks 1–17 always represent the outline of the face, and 43–48 always represent the left eye. The exact code for this comes later in this article.
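To make the numbering concrete, here is a small sketch of the landmark ranges in the 68-point model, written with the 1-based numbers used in this article. The names `LANDMARK_REGIONS` and `region_slice` are hypothetical helpers for illustration and are not part of dlib; the list of 68 points is a stand-in for real detector output.

```python
# Landmark ranges of the 68-point model, using the 1-based, inclusive
# numbering from this article. Hypothetical helper names, not dlib API.
LANDMARK_REGIONS = {
    "jaw": (1, 17),
    "right_eyebrow": (18, 22),
    "left_eyebrow": (23, 27),
    "nose": (28, 36),
    "right_eye": (37, 42),
    "left_eye": (43, 48),
    "mouth": (49, 68),
}


def region_slice(name):
    """Convert a 1-based inclusive range into a 0-based Python slice."""
    first, last = LANDMARK_REGIONS[name]
    return slice(first - 1, last)


# Stand-in for the 68 detected (x, y) points; real values come from dlib.
landmarks = [(n, n) for n in range(1, 69)]
left_eye = landmarks[region_slice("left_eye")]
print(len(left_eye), left_eye[0], left_eye[-1])
```

Because Python sequences are 0-based, landmark 43 in the article's numbering sits at index 42, which is why the helper subtracts one from the start of each range.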
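How to find the Eye Aspect Ratio (EAR)
Each eye is represented by 6 landmark points, p1 to p6, in the order in which dlib returns them: p1 and p4 are the two corners of the eye, p2 and p3 lie on the upper eyelid, and p6 and p5 lie on the lower eyelid.
The EAR for a single eye is calculated using this formula, where the double bars denote the Euclidean distance between two points:
$$\mathrm{EAR} = \frac{\lVert p_2 - p_6 \rVert + \lVert p_3 - p_5 \rVert}{2\,\lVert p_1 - p_4 \rVert}$$
The higher the EAR, the more widely the eye is open. We decide on a minimum EAR value and use it to determine whether the eye is closed.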
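Here is the utility function that returns the EAR for a single eye (dist is scipy.spatial.distance, which is imported in the main code below):
def eye_aspect_ratio(eye):
    # eye holds the six (x, y) points p1..p6 in order, so eye[0] is p1
    # Vertical distances between the upper and the lower eyelid
    p2_minus_p6 = dist.euclidean(eye[1], eye[5])
    p3_minus_p5 = dist.euclidean(eye[2], eye[4])
    # Horizontal distance between the two corners of the eye
    p1_minus_p4 = dist.euclidean(eye[0], eye[3])
    ear = (p2_minus_p6 + p3_minus_p5) / (2.0 * p1_minus_p4)
    return ear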
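As a quick sanity check of the formula, the sketch below runs the same computation on two made-up sets of six points, one for an open eye and one for a nearly closed eye. The coordinates are invented for illustration, and `math.dist` from the standard library stands in for `dist.euclidean` so that the snippet runs without SciPy.

```python
from math import dist  # stdlib Euclidean distance (Python 3.8+)


def eye_aspect_ratio(eye):
    # eye holds the six points p1..p6 in order, so eye[0] is p1.
    p2_minus_p6 = dist(eye[1], eye[5])
    p3_minus_p5 = dist(eye[2], eye[4])
    p1_minus_p4 = dist(eye[0], eye[3])
    return (p2_minus_p6 + p3_minus_p5) / (2.0 * p1_minus_p4)


# Invented coordinates: the eye is 3 units wide in both cases.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]

open_ear = eye_aspect_ratio(open_eye)      # (2 + 2) / (2 * 3)
closed_ear = eye_aspect_ratio(closed_eye)  # (0.2 + 0.2) / (2 * 3)
print(round(open_ear, 3), round(closed_ear, 3))
```

The open eye gives an EAR of about 0.67 and the nearly closed one about 0.07, so a threshold such as 0.2 separates the two cases. Real EAR values depend on the person and the camera angle, so these numbers are only illustrative.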
The main code (Drowsiness Detector)
1. We will start by importing the necessary Python libraries:
import cv2
import dlib
import imutils
from imutils import face_utils
from scipy.spatial import distance as dist
2. We then declare some global configuration variables that are used in the rest of the code:
FACIAL_LANDMARK_PREDICTOR = "shape_predictor_68_face_landmarks.dat"
MINIMUM_EAR = 0.2
MAXIMUM_FRAME_COUNT = 10
FACIAL_LANDMARK_PREDICTOR: The path to dlib’s pre-trained facial landmark predictor. You can download this file from here.
MINIMUM_EAR: The EAR threshold. If the EAR is at or above this value, the eyes are treated as open; below it, they are treated as closed. You may want to tune this parameter for your setup: measure the EAR in different scenarios and then pick the value. Also note that this EAR is not for a single eye but the average of the EAR values of both eyes.
MAXIMUM_FRAME_COUNT: The EAR changes very quickly; even a blink makes it drop sharply. But blinking does not mean drowsiness. Drowsiness is a situation where a person keeps their eyes closed (the EAR stays very low) for, say, 10 consecutive video frames. So this variable is the number of consecutive frames with an EAR below MINIMUM_EAR at which the drowsiness alert is raised.
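The interplay of the two thresholds can be tried out without a camera. The sketch below applies the same counting rule to two invented EAR traces, a short blink and a longer eye closure; the function name `drowsy_frame_indices` and the sample values are hypothetical and exist only for this illustration.

```python
MINIMUM_EAR = 0.2
MAXIMUM_FRAME_COUNT = 10


def drowsy_frame_indices(ear_values, minimum_ear=MINIMUM_EAR,
                         maximum_frame_count=MAXIMUM_FRAME_COUNT):
    """Return the indices of the frames at which the alert would be shown."""
    closed_counter = 0
    alerts = []
    for index, ear in enumerate(ear_values):
        if ear < minimum_ear:
            closed_counter += 1
        else:
            closed_counter = 0  # any open-eye frame resets the streak
        if closed_counter >= maximum_frame_count:
            alerts.append(index)
    return alerts


# Invented EAR traces: a 3-frame blink, then a 12-frame eye closure.
blink = [0.3] * 5 + [0.1] * 3 + [0.3] * 5
closure = [0.3] * 5 + [0.1] * 12 + [0.3] * 5

print(drowsy_frame_indices(blink))
print(drowsy_frame_indices(closure))
```

The blink never triggers an alert because the streak is reset after three frames, while the longer closure triggers it from its tenth closed frame onward.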
3. We then instantiate dlib’s faceDetector (which detects the faces in an image) and landmarkFinder (which finds the 68 landmarks on a detected face), and open the default webcam:
faceDetector = dlib.get_frontal_face_detector()
landmarkFinder = dlib.shape_predictor(FACIAL_LANDMARK_PREDICTOR)
webcamFeed = cv2.VideoCapture(0)
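4. We then find the start and end landmark indices for both eyes. You could hard-code them (37–42 for the right eye and 43–48 for the left eye), but with face_utils you can get these values by just passing the eye name. Note that face_utils returns 0-based indices with an exclusive end, so they can be used directly to slice the landmark array.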
(leftEyeStart, leftEyeEnd) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
(rightEyeStart, rightEyeEnd) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]
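To see how these index pairs line up with the landmark numbers used earlier, the sketch below hard-codes the two pairs as I understand imutils to define them, (42, 48) and (36, 42). Treat these values as an assumption and print the real ones from your installed version to confirm.

```python
# Assumed values of face_utils.FACIAL_LANDMARKS_IDXS for the two eyes.
# Check them against your installed imutils before relying on them.
(leftEyeStart, leftEyeEnd) = (42, 48)
(rightEyeStart, rightEyeEnd) = (36, 42)

# Stand-in for the 68 landmarks; each entry is its 1-based landmark number.
landmarkNumbers = list(range(1, 69))

leftEyeNumbers = landmarkNumbers[leftEyeStart:leftEyeEnd]
rightEyeNumbers = landmarkNumbers[rightEyeStart:rightEyeEnd]
print(leftEyeNumbers)
print(rightEyeNumbers)
```

Each slice contains exactly six entries, which is what eye_aspect_ratio() expects.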
5. The final part is the main loop, where the actual detection happens:
EYE_CLOSED_COUNTER = 0
try:
    while True:
        (status, image) = webcamFeed.read()
        if not status:
            # No frame could be read (for example, the camera is unavailable)
            break
        image = imutils.resize(image, width=800)
        grayImage = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Detect faces; 0 means the image is not upsampled before detection
        faces = faceDetector(grayImage, 0)
        for face in faces:
            # Locate the 68 landmarks and convert them to a NumPy array
            faceLandmarks = landmarkFinder(grayImage, face)
            faceLandmarks = face_utils.shape_to_np(faceLandmarks)
            leftEye = faceLandmarks[leftEyeStart:leftEyeEnd]
            rightEye = faceLandmarks[rightEyeStart:rightEyeEnd]
            leftEAR = eye_aspect_ratio(leftEye)
            rightEAR = eye_aspect_ratio(rightEye)
            # Average the EAR of both eyes
            ear = (leftEAR + rightEAR) / 2.0
            # Outline both eyes on the colour image
            leftEyeHull = cv2.convexHull(leftEye)
            rightEyeHull = cv2.convexHull(rightEye)
            cv2.drawContours(image, [leftEyeHull], -1, (255, 0, 0), 2)
            cv2.drawContours(image, [rightEyeHull], -1, (255, 0, 0), 2)
            # Count consecutive frames with closed eyes
            if ear < MINIMUM_EAR:
                EYE_CLOSED_COUNTER += 1
            else:
                EYE_CLOSED_COUNTER = 0
            cv2.putText(image, "EAR: {}".format(round(ear, 1)), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
            if EYE_CLOSED_COUNTER >= MAXIMUM_FRAME_COUNT:
                cv2.putText(image, "Drowsiness", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        cv2.imshow("Frame", image)
        # Press "q" in the video window to quit
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
except KeyboardInterrupt:
    pass
finally:
    # Free the camera and close the window on exit
    webcamFeed.release()
    cv2.destroyAllWindows()
We declare a variable EYE_CLOSED_COUNTER, which records the number of consecutive frames in which the EAR was less than MINIMUM_EAR.
We resize the image and convert it to grayscale.
We detect all the faces in the image using dlib’s faceDetector:
faces = faceDetector(grayImage, 0)
Loop over each face and find the 68 landmarks using the landmarkFinder of dlib:
faceLandmarks = landmarkFinder(grayImage, face)
Get the landmarks for the left and right eye and then send them to eye_aspect_ratio() to get EAR values for the left and right eye:
leftEye = faceLandmarks[leftEyeStart:leftEyeEnd]
rightEye = faceLandmarks[rightEyeStart:rightEyeEnd]
leftEAR = eye_aspect_ratio(leftEye)
rightEAR = eye_aspect_ratio(rightEye)
Find the average EAR of both eyes:
ear = (leftEAR + rightEAR) / 2.0
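Compute the convex hull of each eye's landmarks and draw it on the image to outline the eyes:
leftEyeHull = cv2.convexHull(leftEye)
rightEyeHull = cv2.convexHull(rightEye)
cv2.drawContours(image, [leftEyeHull], -1, (255, 0, 0), 2)
cv2.drawContours(image, [rightEyeHull], -1, (255, 0, 0), 2)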
If the average EAR for the current frame is less than MINIMUM_EAR, increase the counter; otherwise reset it, because we are interested in consecutive frames only:
if ear < MINIMUM_EAR:
    EYE_CLOSED_COUNTER += 1
else:
    EYE_CLOSED_COUNTER = 0
If the EAR was less than MINIMUM_EAR for the specified number of consecutive frames, we treat this as drowsiness:
if EYE_CLOSED_COUNTER >= MAXIMUM_FRAME_COUNT:
    cv2.putText(image, "Drowsiness", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
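One limitation of counting frames is that the time represented by MAXIMUM_FRAME_COUNT depends on the frame rate: 10 frames is about a third of a second at 30 frames per second, but a full second at 10 frames per second. A possible variation, sketched below, measures how long the eyes have been closed in seconds instead. The class name `ClosedEyeTimer` and the 1.0 second limit are hypothetical choices for this sketch and are not part of the code above.

```python
import time


class ClosedEyeTimer:
    """Track how long the EAR has stayed below a threshold, in seconds."""

    def __init__(self, minimum_ear=0.2, maximum_closed_seconds=1.0):
        self.minimum_ear = minimum_ear
        self.maximum_closed_seconds = maximum_closed_seconds
        self.closed_since = None  # timestamp of the first closed frame

    def update(self, ear, now=None):
        """Feed one EAR value; return True if drowsiness should be flagged."""
        if now is None:
            now = time.monotonic()
        if ear >= self.minimum_ear:
            self.closed_since = None  # eyes open: reset the streak
            return False
        if self.closed_since is None:
            self.closed_since = now
        return (now - self.closed_since) >= self.maximum_closed_seconds


# Simulated (timestamp, EAR) pairs instead of a live camera feed.
timer = ClosedEyeTimer()
samples = [(0.0, 0.3), (0.5, 0.1), (1.0, 0.1), (1.5, 0.1), (2.0, 0.3)]
results = [timer.update(ear, now=t) for t, ear in samples]
print(results)
```

In the main loop, one such timer could replace EYE_CLOSED_COUNTER by calling `timer.update(ear)` once per frame and showing the "Drowsiness" text whenever it returns True.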
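Once everything is in place and you run the code, a window shows the webcam feed with both eyes outlined and the current EAR in the top-left corner. The text "Drowsiness" appears below it when the eyes stay closed for long enough.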
Final words
In this article, we used the dlib library to determine facial landmarks. The dlib library is simple to use and runs smoothly even without a GPU. There are other free-to-use libraries as well, and MediaPipe is one example. If you are not aware of this library, you can check my previous article covering it. At the time of writing this article, the Face module of the MediaPipe library is not available for Python. If the Face module becomes available later, it is worth giving it a try, as its facial landmark detection may well turn out to be better than dlib's. You can find the source code for this article on GitHub here.