Face Liveness Detection through Blinking Eyes

Andi Sama
Published in The Startup · 16 min read · Jun 2, 2020

Detect the Presence of Live Human Face with Open Source Tools

Andi Sama CIO, Sinergi Wahana Gemilang with Andrew Widjaja and Cahyati S. Sangaji

Supporting files (Python Notebook, videos and images) are available in github.

Nowadays, face recognition has been deployed in various areas of application, especially as an additional or secondary measure in access control (for a select few applications, it has even become the primary authentication method). Face recognition is being applied as an alternative to the most commonly used authentication method today: a userid with a password, sometimes combined with some form of captcha.

One of the challenges in implementing face recognition (for humans) is detecting whether the face being presented is real or not, e.g. someone may present a photograph instead of the actual face.

This article illustrates a practical way of detecting whether a presented human face is real or not, based on the presence of blinking eyes. The input is acquired from a streaming video camera or a video file.

The following shows a sample of blinking eyes detected from a video file.

Face liveness detection through blinking-eyes detection, applied to a video-file stream from one scene of a Korean drama trailer (the original series is streamed on Netflix as “Chief of Staff 2 Ep 1”). The captured frame shows that blinking eyes have been detected on Kang Sun Young’s face, a role played by the Korean actress Shin Min Ah (Drama Vibe, 2019).

Preparing the Environment

The following shows the environment that we are using on the Windows 10 operating system. Previous articles were mostly implemented on the Ubuntu or RHEL Linux distributions, either installed locally (on-premise) or in the cloud such as IBM IaaS Cloud, GCP (Google Cloud Platform IaaS), or AWS EC2 (Amazon Web Services, Elastic Compute Cloud). We are doing something different now: everything runs fully on Windows 10.

# Report the environment: OS, Anaconda, Python, and key library versions
import os, platform, sys, time
from datetime import date
print('OS name:', os.name, ', system:', platform.system(), ', release:', platform.release())
print("Anaconda version:")
!conda list anaconda
print("Python version:", sys.version)
print("Python version info: ", sys.version_info)
import cv2
print("OpenCV version:", cv2.__version__)
import numpy as np
print("numpy version:", np.__version__)
import tensorflow as tf
print("Keras, tensorflow version:", tf.keras.__version__, tf.__version__)
from tqdm import tqdm                  # progress bars during face encoding
from collections import defaultdict
from asm_eye_status import *           # eye open/closed model helpers (local file)
import face_recognition
print("Face Recognition version:", face_recognition.__version__)
import imutils
from imutils.video import VideoStream  # threaded video stream reader

The output:

OS name: nt , system: Windows , release: 10
Anaconda version:
# packages in environment at C:\Users\andis\anaconda3:
#
# Name Version Build Channel
_anaconda_depends 2019.03 py37_0
anaconda custom py37_1
anaconda-client 1.7.2 py37_0
anaconda-navigator 1.9.12 py37_0
anaconda-project 0.8.4 py_0
Python version: 3.7.7 (default, Mar 23 2020, 23:19:08) [MSC v.1916 64 bit (AMD64)]
Python version info: sys.version_info(major=3, minor=7, micro=7, releaselevel='final', serial=0)
OpenCV version: 4.2.0
numpy version: 1.18.1
Keras, tensorflow version: 2.2.4-tf 2.1.0
Using TensorFlow backend.
Face Recognition version: 1.2.3

To experiment with the code in this article, some preparations need to be done.

First of all, install Anaconda for Windows 10 (Anaconda, 2020), followed by OpenCV (Pranav Sreedhar, 2019). By default, Python and Jupyter Notebook (an interactive integrated development environment) are installed with Anaconda.

First things first: it may be wise to create a virtual environment before doing anything further, right after the installation of Anaconda. In Anaconda Prompt, use the ‘conda create’ and ‘conda activate’ commands.

(base) C:\Users\andis>conda create -n myenv
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: C:\Users\andis\anaconda3\envs\myenv
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
# To activate this environment, use
# $ conda activate myenv
# To deactivate an active environment, use
# $ conda deactivate
(base) C:\Users\andis>conda activate myenv
(myenv) C:\Users\andis>

The rest of the required modules can mostly be installed through ‘conda install package_name’ from the Anaconda Prompt.

(myenv) C:\Users\andis>conda install pip

A select few packages may need to be installed through ‘pip install package_name’, like ‘pip install face_recognition’ to install the library that encodes and verifies faces by name.

(myenv) C:\Users\andis>pip install package_name

Once all the environment preparation is done, we can start Jupyter Notebook from Anaconda Navigator and work with the Python code shown before.

In general, especially in Python, we need to convert an image to an internal format that we can work on, and typically this will be a NumPy array (Justin Johnson, Spring 2020). A NumPy array is an nd-array (n-dimensional array).

An RGB image contains data in 3 dimensions (height, width, channel), e.g. (768, 1024, 3), which is 786,432 pixels and 2,359,296 values in total (768 * 1024 * 3). Each channel value is 8 bits (1 byte), ranging from 0–255, so each RGB pixel carries 3 bytes (24 bits) of data (1 byte for each channel: R, G, and B).

When these pixel data are converted to an nd-array, they are easier to manipulate. This is done through a combination of the tools that we are discussing here: Python Image Library (PIL), NumPy, and OpenCV, while the blinking-eyes recognition is done through Keras, a deep learning framework based on the TensorFlow library.
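
As a small illustration of this conversion, the following is a minimal sketch (not part of the notebook) of loading an image with PIL and with OpenCV and inspecting the resulting NumPy array; the filename ‘sample_face.jpg’ is hypothetical.

# A minimal sketch: load an image with PIL and OpenCV and inspect its NumPy form.
# The filename 'sample_face.jpg' is hypothetical.
import cv2
import numpy as np
from PIL import Image

pil_image = Image.open('sample_face.jpg')                  # PIL image (RGB)
rgb_array = np.array(pil_image)                            # nd-array, shape (height, width, 3)
print('PIL -> NumPy shape:', rgb_array.shape, ', dtype:', rgb_array.dtype)

bgr_array = cv2.imread('sample_face.jpg')                  # OpenCV loads images as BGR by default
rgb_from_cv = cv2.cvtColor(bgr_array, cv2.COLOR_BGR2RGB)   # convert to RGB when needed
print('OpenCV -> NumPy shape:', bgr_array.shape)           # e.g. (768, 1024, 3)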

Face Liveness Detection through Blinking Eyes

The code is a fork of (Jordan Van Eetveldt, 2019a) with a few modifications to run on Windows 10 (in which Anaconda manages most of the packages/libraries) with OpenCV 4.2, Keras 2.2.4-tf and a few supporting libraries. Note that Python’s NumPy library is used for image manipulation in the form of nd-arrays (n-dimensional arrays).

While in the previous illustration we saw face liveness detection through blinking-eyes detection with input from a video file, the following shows video input from a different device (the integrated camera on the laptop).

It shows face liveness detection through blinking-eyes detection, applied to a video stream acquired from the integrated camera of a Lenovo Thinkpad T480 laptop. The captured frame demonstrates that blinking eyes (the sequence of open-closed-open eyes) have been detected on the author’s face. The name is shown if a blink is detected; the author’s name appears because his face has been encoded previously (the JPEG training images were provided by generating a sequence of images of the author’s face).

When blinking eyes are detected, an eye icon is also displayed on the top right, along with the person’s name (if the face has been encoded before).

The main pseudo-Python-code for face liveness detection through blinking eyes is shown below.

main():
    # initialize all necessary things
    (model, face_detector, open_eyes_detector,
     left_eye_detector, right_eye_detector,
     video_capture, images, source_resolution) = init()
    # generate encoded vectors for all JPEG images under
    # each 'defined name' directory
    data = process_and_encode(images)
    eyes_detected = defaultdict(str)
    out = VideoWriter(out_filename, encoding, frame_rate,
                      source_resolution)
    while True:
        # get a frame, detect faces & names, detect blinking eyes
        frame = detect_and_display(model,
                video_capture, face_detector,
                open_eyes_detector, left_eye_detector, right_eye_detector,
                data, eyes_detected)
        # show the video with overlay boxes
        # (face, detected blinking eyes) on the screen
        if frame is None: break
        out.write(frame)
        show_VideoFrame('Face Liveness Detector - Blinking Eyes (q-quit, p-pause)', frame)
        key_pressed = waitforKey()
        # 'q' to exit, 'p' to pause the video being played
        if key_pressed is 'q': break
        elif key_pressed is 'p': pause
    video_capture.stop()
    out.release()
    cv2.destroyAllWindows()

The following shows the actual Python code (the main() section) and its generated output during execution.

if __name__ == "__main__":
    print("[LOG] Initialization...")
    # input in init(video_source); 0:WebCam, 1:VideoFile
    video_source = int(select_source())
    (model, face_detector,
     open_eyes_detector, left_eye_detector, right_eye_detector,
     video_capture, images, source_resolution) = init(video_source)
    data = process_and_encode(images)

    # Define output filename
    out_dir = 'output/'
    if video_source == 0:  # camera
        out_filename = out_dir + 'camera_face-blink_detect.mp4'
    else:  # video file
        out_filename = out_dir + 'video_face-blink_detect.mp4'

    # Define the codec and create VideoWriter object
    fourcc = cv2.VideoWriter_fourcc(*'XVID')
    frame_rate = 5
    out = cv2.VideoWriter(out_filename, fourcc, frame_rate, source_resolution)

    eyes_detected = defaultdict(str)
    imshow_label = "Face Liveness Detector - Blinking Eyes (q-quit, p-pause)"
    print("[LOG] Detecting & Showing Images...")
    while True:
        frame = detect_and_display(model, video_capture,
                                   face_detector, open_eyes_detector,
                                   left_eye_detector, right_eye_detector,
                                   data, eyes_detected)
        if frame is None:
            break
        out.write(frame)
        cv2.imshow(imshow_label, frame)

        # asama: modified to include p=pause
        key_pressed = cv2.waitKey(1)
        if key_pressed & 0xFF == ord('q'):  # q=quit
            break
        elif key_pressed & 0xFF == ord('p'):  # p=pause
            cv2.waitKey(-1)
    print("[LOG] Writing output file...", out_filename)
    video_capture.stop()
    out.release()
    cv2.destroyAllWindows()
    print("[LOG] All done.")

The generated output is as follows. It opens a video file as shown before. Note that there are 45 JPEG images encoded in total, with a total encoding time of about 23 seconds (an average of 1.9 iterations/second).

[LOG] Initialization...
Please select source:
0: Webcam
1: Videofile
1
3it [00:00, 3030.57it/s]
0%| | 0/45 [00:00<?, ?it/s]
[LOG] Opening default video file... C:\Users\andis\Code\FaceRecdata\Chief of Staff 2 Ep 1 Trailer.mp4
[LOG] Getting Video Resolution...
Video resolution (width, height) in pixels: (1280, 720)
[LOG] Collecting images...
[LOG] Encoding faces...
100%|██████████| 45/45 [00:23<00:00, 1.91it/s]
[LOG] Detecting & Showing Images...
empty frame detected! - camera closed or end of file?, exiting...
[LOG] Writing output file... output/video_face-blink_detect.mp4
[LOG] All done.

A few more snapshots from the video file are shown below.

As for the face recognition by name shown before, the following file, ‘asm_Generate face dataset.ipynb’, contains the Python code to generate the face images, acquired directly from the integrated camera.

# Author: Andi Sama
# Purpose: Generate face dataset through integrated camera
# Creation Date: April 13, 2020, finalized on April 15, 2020
import os, cv2
from imutils.video import VideoStream

def generate_faces(new_path, new_face):
    video_capture = VideoStream(src=0).start()
    count = 1
    while True:
        frame = video_capture.read()
        cv2.imshow("recording faces...", frame)
        key_pressed = cv2.waitKey(500)  # wait 0.5 second
        filename = new_path + '/' + new_face + str(count) + '.jpg'
        if not (key_pressed & 0xFF == ord('q')):  # q=quit
            cv2.imwrite(filename, frame)
            count += 1
        else:
            break
    cv2.destroyAllWindows()
    video_capture.stop()
    print("[LOG] recording done.")
    status = 1
    return status

if __name__ == "__main__":
    face_dir = 'faces/'
    new_face = 'Andi Sama'
    new_path = face_dir + new_face
    # if a sub-directory with the new name does not exist, then create it
    cwd = os.getcwd()
    if os.path.exists(new_path):
        print('Sub directory: "', new_path + '" exists in', cwd, '- please remove it first')
    else:
        try:
            os.mkdir(new_path)
            print('Sub directory: "', new_path + '" created')
            print('LOG: Generating images of face...', new_face)
            if generate_faces(new_path, new_face):
                print("success")
            else:
                print("failed")
        except FileExistsError:
            print('Sub directory: "', new_path + '" already exists')

The output:

Sub directory: " faces/Andi Sama" created
LOG: Generating images of face... Andi Sama
[LOG] recording done.
success

Stage-1: Initialization

The initialization stage includes generating encoded vectors for all JPEG images within each defined name under the ‘faces/’ directory. The encoded vectors are kept in memory. These vectors could also be stored in a No-SQL database such as MongoDB or Cloudant, for example, so we would not have to re-encode them every time we run the face recognition; the database of encoded vectors could then be retrieved whenever we want to do a face comparison.
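
As a simple illustration of this idea (not the notebook’s code, and using a local pickle file instead of a No-SQL database), the following minimal sketch caches whatever structure process_and_encode() returns, so encoding only has to happen once; the cache filename is hypothetical.

# A minimal sketch: cache the encoded face vectors on disk so process_and_encode()
# does not have to run on every start. It assumes 'data' (the return value of
# process_and_encode) is picklable; a No-SQL database could be used the same way.
import os, pickle

ENCODINGS_CACHE = 'encodings.pickle'    # hypothetical cache file name

def load_or_encode(images):
    if os.path.exists(ENCODINGS_CACHE):
        with open(ENCODINGS_CACHE, 'rb') as f:
            return pickle.load(f)        # reuse previously encoded vectors
    data = process_and_encode(images)    # encode once (function defined in the notebook)
    with open(ENCODINGS_CACHE, 'wb') as f:
        pickle.dump(data, f)
    return data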

It also initializes the source device for the video stream (switch the commented lines within the init() function in the ‘FaceRecognition.ipynb’ file to switch between the two video sources: the code supports getting the video stream either from a video file or from a camera).

The base directory structure and the content of the ‘faces/’ sub-directory are as follows. Note that the ‘Jang Tae Jun’ and ‘Shin Min Ah’ sub-directories are created manually, while the ‘Andi Sama’ directory is created by running the Jupyter Notebook file ‘asm_Generate face dataset.ipynb’.

(myenv) C:\Users\andis\Code\FaceRec>dir
Volume in drive C is Windows
Volume Serial Number is 7AA3-F000
Directory of C:\Users\andis\Code\FaceRec
04/16/20 07:49 PM <DIR> .
04/16/20 07:49 PM <DIR> ..
04/12/20 03:24 AM 3,331 asm_eye_status.py
04/15/20 07:01 PM 4,865 asm_Generate face dataset.ipynb
04/11/20 02:54 PM <DIR> data
03/31/20 08:44 PM <DIR> dataset
04/15/20 08:09 PM 16,657 FaceRecognition.ipynb
04/16/20 07:48 PM <DIR> faces
03/31/20 08:45 PM 601,660 haarcascade_eye_tree_eyeglasses.xml
03/31/20 08:45 PM 919,871 haarcascade_frontalface_alt.xml
03/31/20 08:45 PM 195,368 haarcascade_lefteye_2splits.xml
03/31/20 08:45 PM 196,169 haarcascade_righteye_2splits.xml
03/31/20 08:45 PM 50,350 lbpcascade_frontalface.xml
03/31/20 08:45 PM 190,720 model.h5
03/31/20 08:45 PM 3,256 model.json
04/11/20 07:53 PM 254 README.md
11 File(s) 2,182,501 bytes
5 Dir(s) 38,180,888,576 bytes free
(myenv) C:\Users\andis\Code\FaceRec>dir faces
Volume in drive C is Windows
Volume Serial Number is 7AA3-F000
Directory of C:\Users\andis\Code\FaceRec\faces
04/16/20 07:48 PM <DIR> .
04/16/20 07:48 PM <DIR> ..
04/15/20 07:00 PM <DIR> Andi Sama
0 File(s) 0 bytes
3 Dir(s) 38,180,401,152 bytes free

Stage 2: Recognize Face by Name & Detect Liveness of Human Face

Following the initialization process, the code enters a loop in which it continuously extracts frames, detects human faces & names, and detects the presence of blinking eyes. We can press either the ‘q’ (quit) or ‘p’ (pause) key at any time while the video is playing on the screen.

The code has two main purposes: recognizing faces and detecting the liveness of recognized faces.

First, it locates human faces within an image (there can be multiple faces). Then, it associates the recognized faces with names (if person names are provided as directory names within the ‘faces/’ sub-directory containing JPEG images). Finally, the code detects the liveness of the human face through blinking eyes.

1. Recognize human face by name

Faces are “trained” by providing enough samples of JPEG images in the ‘faces/’ sub-directory. This is processed by the face_recognition library (which needs to be installed first). Three functions of the face_recognition library are used: face_locations(), face_encodings() and compare_faces().

  • face_locations(): detects the presence and locations of faces within an image.
  • face_encodings(): encodes a face image into a vector of 128 features. In this case, two faces (the actor and actress from the video file shown before) have been encoded with about 20 JPEG images each (taken from Google image search).
  • compare_faces(): compares the embedding vector of a face extracted from a webcam frame against all encoded faces in the dataset, based on the distance between the vectors. The closest vectors represent the same person.

The name of each sub-directory within the ‘faces/’ sub-directory becomes the name of the person. Every “trained” face (every JPEG file) is encoded into a vector of 128 features.
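
The following is a minimal sketch of how these three face_recognition calls fit together; it is not the notebook’s exact code, and it assumes that known_encodings and known_names were built from the images under ‘faces/’, and that ‘frame.jpg’ is a hypothetical input frame.

# A minimal sketch of the three face_recognition calls described above.
import face_recognition

frame_rgb = face_recognition.load_image_file('frame.jpg')     # hypothetical frame image (RGB)
face_locations = face_recognition.face_locations(frame_rgb)   # (top, right, bottom, left) boxes
face_encodings = face_recognition.face_encodings(frame_rgb, face_locations)  # 128-d vectors

for encoding in face_encodings:
    # compare this face against all previously encoded faces
    matches = face_recognition.compare_faces(known_encodings, encoding, tolerance=0.6)
    name = 'Unknown'
    if True in matches:
        name = known_names[matches.index(True)]   # first matching known face
    print('Detected:', name)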

2. Detect liveness of human face through blinking eyes

The liveness of the human face is detected by recognizing the “open-closed-open” pattern of the eyes, a sequence indicating a blink.

The chosen model is LeNet-5, a CNN (Convolutional Neural Network), which has been trained on the Closed Eyes In The Wild (CEW) dataset (Pranav Sreedhar, 2019). The dataset is composed of around 4,800 eye images (24x24 pixels in size).

OpenCV’s pre-trained Haar-cascade classifiers are used to detect faces & eyes in real time (OpenCV.org, 2019).
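
The following is a minimal sketch of face and eye detection with those pre-trained Haar cascades (the XML files listed in the directory earlier); the input filename and the detectMultiScale parameters are assumptions for illustration, not the notebook’s exact values.

# A minimal sketch of face and open-eye detection with OpenCV's Haar cascades.
import cv2

face_detector = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')
open_eyes_detector = cv2.CascadeClassifier('haarcascade_eye_tree_eyeglasses.xml')

frame = cv2.imread('frame.jpg')                      # hypothetical input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)       # Haar cascades work on grayscale
faces = face_detector.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5,
                                       minSize=(50, 50))
for (x, y, w, h) in faces:
    face_gray = gray[y:y+h, x:x+w]                   # restrict eye search to the face region
    eyes = open_eyes_detector.detectMultiScale(face_gray, scaleFactor=1.1, minNeighbors=3)
    print('face at', (x, y, w, h), '-', len(eyes), 'open eye(s) detected')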

Train a Neural Network to Recognize Open/Closed Eyes in a Face

While a trained Keras model for eye detection has been provided, we can re-train the model using the following code (the functions are available in the file ‘asm_eye_status.ipynb’). The sample output below is from training the model (on CPU) on April 19, 2020, for 100 epochs. During a few minutes of training, two model files are generated in the current directory: ‘model.h5’ (the Keras-based model) and ‘model.json’. The trained model reaches training & validation accuracies of 99.79% (training loss: 0.0104) and 95.56% (validation loss: 5.0244e-06), respectively.

# Create a deep learning model to recognize open/closed eyes
epoch = 100
train_generator, val_generator = collect()
train(train_generator, val_generator, epoch)

The output:

Found 3779 images belonging to 2 classes.
Found 1067 images belonging to 2 classes.
[LOG] Initializing Neural Network...
Training neural network model for 100 epochs...
Epoch 1/100
118/118 [==============================] - 2s 19ms/step - loss: 0.5973 - accuracy: 0.6776 - val_loss: 0.4890 - val_accuracy: 0.7614
Epoch 2/100
118/118 [==============================] - 2s 20ms/step - loss: 0.4340 - accuracy: 0.8038 - val_loss: 0.4325 - val_accuracy: 0.8087
Epoch 3/100
118/118 [==============================] - 2s 20ms/step - loss: 0.3473 - accuracy: 0.8551 - val_loss: 0.1838 - val_accuracy: 0.8715
Epoch 4/100
118/118 [==============================] - 2s 19ms/step - loss: 0.3091 - accuracy: 0.8810 - val_loss: 0.2923 - val_accuracy: 0.8870
Epoch 5/100
118/118 [==============================] - 2s 19ms/step - loss: 0.2735 - accuracy: 0.8914 - val_loss: 0.1812 - val_accuracy: 0.8841
Epoch 10/100
118/118 [==============================] - 2s 19ms/step - loss: 0.1285 - accuracy: 0.9504 - val_loss: 0.2367 - val_accuracy: 0.9295
...
Epoch 50/100
118/118 [==============================] - 2s 19ms/step - loss: 0.0495 - accuracy: 0.9832 - val_loss: 0.2303 - val_accuracy: 0.9440
...
Epoch 75/100
118/118 [==============================] - 2s 19ms/step - loss: 0.0363 - accuracy: 0.9864 –
...
Epoch 85/100
118/118 [==============================] - 2s 20ms/step - loss: 0.0173 - accuracy: 0.9931 –
...
Epoch 90/100
118/118 [==============================] - 2s 19ms/step - loss: 0.0184 - accuracy: 0.9936 - val_loss: 0.5056 - val_accuracy: 0.9411
...
Epoch 95/100
118/118 [==============================] - 2s 20ms/step - loss: 0.0135 - accuracy: 0.9947 - val_loss: 0.0291 - val_accuracy: 0.9585
...
Epoch 98/100
118/118 [==============================] - 2s 19ms/step - loss: 0.0102 - accuracy: 0.9968 - val_loss: 0.0664 - val_accuracy: 0.9469
Epoch 99/100
118/118 [==============================] - 2s 19ms/step - loss: 0.0133 - accuracy: 0.9938 - val_loss: 0.0028 - val_accuracy: 0.9488
Epoch 100/100
118/118 [==============================] - 2s 19ms/step - loss: 0.0150 - accuracy: 0.9928 - val_loss: 1.7811e-04 - val_accuracy: 0.9488
Saving model to current directory...
Done.

The structure of the neural network layers for our deep learning model to recognize open/closed eyes is shown in the model plot below.
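
For readers who want a concrete starting point, the following is a minimal sketch of a LeNet-style CNN for open/closed-eye classification on 24x24 grayscale eye crops; the exact filter counts and layer sizes here are assumptions and may differ from the model in ‘model.json’ / ‘model.h5’.

# A minimal sketch of a LeNet-style CNN for open/closed eye classification.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, AveragePooling2D, Flatten, Dense

model = Sequential([
    Conv2D(6, (3, 3), activation='relu', input_shape=(24, 24, 1)),  # 24x24 grayscale eye crop
    AveragePooling2D(),
    Conv2D(16, (3, 3), activation='relu'),
    AveragePooling2D(),
    Flatten(),
    Dense(120, activation='relu'),
    Dense(84, activation='relu'),
    Dense(1, activation='sigmoid'),   # 1 = open eye, 0 = closed eye
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()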

The model needs to be trained just once. Once the model has been generated, this part of the code can be commented out. In general, 20 epochs should be enough to achieve about 94–95% accuracy on the training & validation datasets.

The model is applied in the predict() function in ‘asm_eye_status.ipynb’, which takes the image (of either the left or right eye) and the model, then evaluates whether the prediction (the return value of predict()) is less than 0.1 or above 0.9. If the prediction is < 0.1, we assume the eye is in the closed state; if the prediction is > 0.9, we assume the eye is in the open state. The ‘open-closed-open’ pattern indicating a blink (for either the left or right eye, or for both) is detected through this approach.
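
The following is a minimal sketch of this thresholding and pattern-matching idea; it is an assumption for illustration, not the notebook’s exact code, and the history-string representation and max_closed_frames parameter are made up for the example.

# A minimal sketch: turn the prediction score into an open/closed state and
# detect an 'open-closed-open' sequence (a blink) in a per-eye history string.
def eye_state(prediction):
    # predict() is assumed to return a score in [0, 1]
    if prediction < 0.1:
        return '0'            # closed eye
    if prediction > 0.9:
        return '1'            # open eye
    return ''                 # uncertain: ignore this frame

def is_blinking(history, max_closed_frames=3):
    # look for '1', then 1..N consecutive '0's, then '1'
    for n in range(1, max_closed_frames + 1):
        if '1' + '0' * n + '1' in history:
            return True
    return False

# usage: the history is collected frame by frame for one eye
print(is_blinking('111001111'))   # True: the eye closed briefly and reopened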

Notes on Modifications from The Original Code

Modifications to the original code are applied in both the ‘asm_eye_status.py’ file (originally ‘eye_status.py’) and the ‘FaceRecognition.ipynb’ file (originally ‘face_rec.py’).

In the ‘asm_eye_status.py’ file, adjustments are made to replace the deprecated functions from scipy.ndimage (imread) and scipy.misc (imresize, imsave) that were used in the predict(img, model) function. The NumPy library is used as a replacement, especially for reshaping arrays that are acquired from the PIL image format.

The README.md file is also modified, while the rest of the files are left as they are.

A few modifications are made in the main program, the ‘FaceRecognition.ipynb’ file. The summary of changes is as follows. Note that the ‘faces/’ subdirectory is added to encode the two Korean artists from the series (Drama Vibe, 2019).

In init()

  • add support to read a video file using OpenCV, in addition to the video stream from the camera.
  • get information on the source resolution using OpenCV (either from the camera or from the video file), while still using imutils’ VideoStream() for much better buffering than OpenCV’s slower VideoCapture; the source resolution is needed to write the processed video (with recognized faces and blinking eyes) to an external file in the ‘output/’ subdirectory. A sketch of this idea is shown after this list.
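
The following is a minimal sketch of that idea (an assumption, not the notebook’s exact init() code): probe the source resolution with OpenCV, then hand the actual frame reading to imutils’ VideoStream; the default video filename is hypothetical.

# A minimal sketch: read source resolution with OpenCV, read frames with VideoStream.
import cv2
from imutils.video import VideoStream

def open_source(video_source, video_file='video.mp4'):     # video_file is hypothetical
    src = 0 if video_source == 0 else video_file
    probe = cv2.VideoCapture(src)                           # used only to query properties
    width = int(probe.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(probe.get(cv2.CAP_PROP_FRAME_HEIGHT))
    probe.release()
    video_capture = VideoStream(src=src).start()            # threaded reader for the frames
    return video_capture, (width, height)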

In detect_and_display()

  • increase the scaling factor (of the frame) in cv2.resize() from 0.6 to 1.0.
  • display the current date & time on the top left of the frame.
  • when blinking eyes are detected, display an overlay eye icon on the top right of the frame. This is done through OpenCV’s addWeighted() function, by overlaying a new image onto the existing image and adjusting its alpha channel (see the sketch after this list).
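
The following is a minimal sketch of that overlay (not the notebook’s exact code): blend a small icon into the top-right corner of the frame with cv2.addWeighted(); the icon filename, margin, and blending weight are assumptions for illustration.

# A minimal sketch: blend an eye icon into the top-right corner of a frame.
import cv2

def overlay_eye_icon(frame, icon_path='eye_icon.png', alpha=0.7):
    icon = cv2.imread(icon_path)                       # small BGR icon (hypothetical file)
    if icon is None:
        return frame
    h, w = icon.shape[:2]
    x0 = frame.shape[1] - w - 10                       # 10-pixel margin from the right edge
    roi = frame[10:10 + h, x0:x0 + w]                  # region of the frame under the icon
    blended = cv2.addWeighted(icon, alpha, roi, 1 - alpha, 0)
    frame[10:10 + h, x0:x0 + w] = blended
    return frame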

In main()

  • add the ability to select the source (camera or video file) by calling the added select_source() function.
  • add the ability to pause when a video stream is acquired from a video file.
  • add the ability to write the processed input to the external file.

One notebook, ‘asm_Generate face dataset.ipynb’, is added to generate images of the face that we want to recognize (by name). These generated face image files reside within the directory ‘faces/Your Name’, in which ‘Your Name’ needs to be defined within the ‘asm_Generate face dataset.ipynb’ file, e.g. new_face = ‘Andi Sama’.

With this, we conclude the article. Considering the long period of working from home, one more experiment is done: recognizing a face with a mask ON (see below). Surprisingly, the face & name are still recognized, without retraining/re-encoding the face_recognition library with a dataset of masked faces.

The author with a mask ON. An experiment is done by recognizing the face with the mask ON. The face & name are still recognized, without retraining/re-encoding the face_recognition library with a dataset of masked faces.

What’s next then?

We have discussed an Artificial Intelligence (AI) approach, a use case of face liveness detection by identifying the presence of blinking eyes in humans. The input is taken from a video stream (camera), and the blinking eyes are detected through a trained deep learning model.

Being a Data Scientist (or just an aspiring Data Scientist) requires continuous, lifelong learning in exploring and experimenting with datasets, as well as a lot of curiosity to keep ourselves updated on the various types of algorithms and tools for use-case implementations across industries.

Like scientists in other disciplines who dedicate their lifetimes to their area of research, we must keep ourselves updated and strive to improve on the state of the art. It may seem almost impossible to achieve the defined goals at the beginning of the journey; however, with strong persistence and a lot of patience, all of the effort we have been putting in will, more often than not, be worth it at the end of the road.

Being hands-on and having up-to-date knowledge of, and experience with, practical products or open-source tools for analytics or machine learning / deep learning modeling are invaluable skills for a Data Scientist.

This time we discussed face liveness detection in humans by processing images coming from a video stream (camera) as well as from a video file. We used the Python programming language, along with the OpenCV, NumPy, imutils, and Python Image Library (PIL) libraries, with a deep learning model to detect the presence of a face and blinking eyes. The model is created using Keras, a deep learning framework that is built on top of the TensorFlow library.

Well, let’s get started by doing something. And the right time is now!

References
