Google MLKit: Quick and easy Face Detection in Android

Introduction

Facial recognition systems have been developed over many years; the first such system was built in the 1960s as a computer application. Since then, the technology has found uses in many industries beyond its original scope. One of its early applications in the USA was fraud prevention: recognizing people’s faces so that the same person could not register under multiple different names.

As cutting-edge technology has advanced, so has the use of facial recognition. Nowadays, facial recognition systems can be found in almost every aspect of our lives, from handheld devices and security systems to small sensors in vehicles. This rapid development caught our eye and might make you wonder, “How could you do that?”. Therefore, this article sets out to demystify face recognition.

Demystifying Face Recognition

I bet some questions appear in your mind after reading the intro above; one of them is probably, “Great! Now, how do I implement it?”. First things first, let’s look at the steps needed to achieve that goal. A full face recognition implementation can be separated into 3 major steps:

  • Face Detection (What this article covers)
  • Deep Neural Network
  • Result Classification Method

The diagram below (Diagram 1) will hopefully help you understand how each step fits into the overall flow:

As stated above, Face Detection comes first. It may sound self-explanatory, but you may still wonder, “What is the difference between Face Detection and Face Recognition?”.

Face Detection is the process of finding faces in an image. Face Recognition, on the other hand, is the process of differentiating those faces and classifying them to identify a person. So, to distinguish these two similar (but not identical) terms, we can look at the main process each one is aimed at.

Referring to Diagram 1, Face Recognition consists of 2 steps: Deep Neural Network and Classification. The scope of this article, however, covers only the Face Detection part.

From a human point of view, telling one face apart from another is pretty straightforward, but not for computers. Given an image, all a computer sees is a bunch of 0s and 1s (binary); it does not understand why one array of 0s and 1s is a face and another is not. Therefore, we have to train the Face Recognition process on images that contain faces, so it can turn the binary it sees into recognition.

Face detection — also called facial detection — is an artificial intelligence (AI) based computer technology used to find and identify human faces in digital images. Face detection technology can be applied to various fields — including security, biometrics, law enforcement, entertainment, and personal safety — to provide surveillance and tracking of people in real-time. (Corrine, B. 2020).

This is where Google MLKit jumps in to help. The Google MLKit suite includes a Face Detection library. This tool simplifies the process of face detection by simply, well, doing it for us. We supply it with an image (or images) and ask the library to tell us where the face or faces are located in the image. That lets us crop that particular area, giving us good old face data we can use in the next step.

Now that we understand each term and where MLKit lands in this process, we can get into the nitty-gritty of the implementation, in this case on the Android platform.

Face Detection Implementation

Our implementation of Face Detection uses Google’s MLKit library. This library allows us to detect faces in an image and also extract several parameters, such as:

  • Key facial features
  • Face contours
  • Expressions such as smiling and open eyes
  • Face rotation angle
  • The bounding box where the face is located in the image

This implementation assumes that you are already familiar with Android projects and will not cover how to obtain the Bitmap/Image we are going to work with. You may add your own implementation of a camera, file browser, and/or network image as a source.

Our target is simply to detect any face in a video frame or a single image, and extract that face into a separate, pre-treated image that a Deep Neural Network implementation can work with. Each DNN implementation has different requirements, so we must consider which algorithm we will use when extracting the image.

By default, the face detection library lets us set various options for face detection. Because we only need Bitmap face data with n x n dimensions, we will use minimal settings that detect nothing other than the bounding box.

To add MLKit Face Detection support to an Android project, add this dependency to the app-level build.gradle file:

implementation 'com.google.mlkit:face-detection:16.1.2'

Great! After that, we need to get the FaceDetector client up and running.

The code below shows how we set up the faceDetector client for face detection:
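A minimal sketch of such a client, assuming we only need the bounding box (landmark, contour, and classification detection all disabled):

import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetector
import com.google.mlkit.vision.face.FaceDetectorOptions

// Minimal options: we only need the bounding box, so landmarks, contours,
// and classification (smile / open eyes) are all turned off.
val faceDetectorOptions = FaceDetectorOptions.Builder()
    .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_FAST)
    .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_NONE)
    .setContourMode(FaceDetectorOptions.CONTOUR_MODE_NONE)
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_NONE)
    .build()

// The faceDetector client used throughout the rest of this article.
val faceDetector: FaceDetector = FaceDetection.getClient(faceDetectorOptions)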

This faceDetector client is the main ingredient in our process. You can keep this variable as an injectable singleton or as a new instance in your Fragment/Activity. Your call.

Then we need to prepare the image(s) we want to detect faces in. As mentioned above, you may implement your own code to source the image. This part assumes we already have the Bitmap variable necessary for the process.

As the diagram above shows, we want to convert this image to an NV21 ByteArray. This is not strictly necessary, but this format is recommended to reduce the time the Face Detection step needs.

The code below shows how we convert our Bitmap to an NV21 ByteArray and then into the InputImage that we supply to the faceDetector client we set up earlier:
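A sketch of that conversion; the helper name prepareInputImage and the Bitmap.toNV21() extension (shown in the next snippet) are assumptions for illustration, and rotationDegrees is 0 on the assumption that the bitmap is already upright:

import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage

// Wraps the Bitmap as an NV21-backed InputImage for the faceDetector.
fun prepareInputImage(bitmap: Bitmap): InputImage {
    val nv21: ByteArray = bitmap.toNV21()
    return InputImage.fromByteArray(
        nv21,
        bitmap.width,
        bitmap.height,
        /* rotationDegrees = */ 0,
        InputImage.IMAGE_FORMAT_NV21
    )
}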

As for how the Extension works, you can see the code below:
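One possible implementation of that extension, converting an ARGB_8888 Bitmap to an NV21-encoded ByteArray using the standard BT.601 RGB-to-YUV formulas; the name toNV21 is an assumption for this sketch:

import android.graphics.Bitmap

fun Bitmap.toNV21(): ByteArray {
    val argb = IntArray(width * height)
    getPixels(argb, 0, width, 0, 0, width, height)

    // NV21 layout: full-resolution Y plane followed by interleaved V/U samples,
    // one V/U pair per 2x2 block of pixels.
    val ySize = width * height
    val nv21 = ByteArray(ySize + 2 * ((width + 1) / 2) * ((height + 1) / 2))
    var yIndex = 0
    var uvIndex = ySize

    for (j in 0 until height) {
        for (i in 0 until width) {
            val pixel = argb[j * width + i]
            val r = (pixel shr 16) and 0xFF
            val g = (pixel shr 8) and 0xFF
            val b = pixel and 0xFF

            // BT.601 RGB -> YUV conversion.
            val y = ((66 * r + 129 * g + 25 * b + 128) shr 8) + 16
            nv21[yIndex++] = y.coerceIn(0, 255).toByte()

            if (j % 2 == 0 && i % 2 == 0) {
                val u = ((-38 * r - 74 * g + 112 * b + 128) shr 8) + 128
                val v = ((112 * r - 94 * g - 18 * b + 128) shr 8) + 128
                nv21[uvIndex++] = v.coerceIn(0, 255).toByte()
                nv21[uvIndex++] = u.coerceIn(0, 255).toByte()
            }
        }
    }
    return nv21
}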

The whole process of face detection can be seen in the Flowchart below:

Explaining the flowchart above step by step:

  1. First, we need to prepare the image source. This can come from a camera or a local image. In Android, all this step requires is an image in Bitmap format.
  2. Then we continue with image pre-treatment, an optional step. By default, most Bitmaps use the ARGB_8888 format. This format works fine with MLKit Face Detection, but we can save some processing time by converting it into an NV21 ByteArray, which speeds up detection significantly. The Bitmap or ByteArray is then converted into the InputImage variable that can be passed to the faceDetector.
  3. The MLKit Face Detection step consists of passing the InputImage to the faceDetector. We should also attach an addOnSuccessListener (and ideally an addOnFailureListener).
  4. If the image contains any faces, the onSuccessListener is invoked and returns the Face objects, which contain the information mentioned above such as the bounding box, landmarks, etc. The actual contents depend on the settings we chose earlier; if we set the detector to LANDMARK_MODE_NONE, no landmarks will be retrieved.
  5. Assuming a face was detected in the source image, the next step is to extract just the face into the specific n x n bitmap we use for the DNN model. This can be achieved via the boundingBox provided inside the Face object we retrieved previously. The boundingBox holds the coordinates of the rectangle that encloses the face within the source image. The images below show what the bounding box is, where it sits in an image, and how we cut out the face image to supply to the DNN model; a short code sketch covering steps 3 to 5 follows the images.
Where Bounding Box is Located
Result of the Face Detection
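A minimal sketch of steps 3 to 5, assuming the faceDetector and prepareInputImage from the earlier snippets; detectAndCropFace, onFaceCropped, and targetSize (the n x n dimension your DNN model expects) are hypothetical names for illustration:

import android.graphics.Bitmap
import android.util.Log

fun detectAndCropFace(bitmap: Bitmap, targetSize: Int, onFaceCropped: (Bitmap) -> Unit) {
    // Step 3: feed the InputImage into the faceDetector.
    faceDetector.process(prepareInputImage(bitmap))
        .addOnSuccessListener { faces ->
            // Step 4: take the first detected face, if any; its contents depend on
            // the detector options (e.g. no landmarks with LANDMARK_MODE_NONE).
            val face = faces.firstOrNull() ?: return@addOnSuccessListener

            // Step 5: clamp the bounding box to the bitmap (it can extend slightly
            // past the image edges), then crop and scale to the n x n model input.
            val box = face.boundingBox
            val left = box.left.coerceAtLeast(0)
            val top = box.top.coerceAtLeast(0)
            val width = box.width().coerceAtMost(bitmap.width - left)
            val height = box.height().coerceAtMost(bitmap.height - top)

            val faceBitmap = Bitmap.createBitmap(bitmap, left, top, width, height)
            onFaceCropped(Bitmap.createScaledBitmap(faceBitmap, targetSize, targetSize, true))
        }
        .addOnFailureListener { e ->
            Log.e("FaceDetection", "Face detection failed", e)
        }
}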

Lessons Learned

These are the main points to take away about the Face Recognition process:

  • We can break the Face Recognition process into 3 major steps: Face Detection, Deep Neural Network, and Classification.
  • We have seen that Google MLKit helps us to do face detection easily to extract face data. You can use the extracted face data to feed the Deep Neural Network.

The lessons learned bring this Face Recognition article to a close. But don’t worry: we’ll talk more about Deep Neural Networks on mobile devices using TensorFlow Lite in the next article. Stay tuned to Gravel’s page for updates.
