Let’s ‘Face’ it! — Face detection with Firebase ML Kit

Ketan Vichare
Sixt Research & Development India
3 min read · Feb 3, 2020

To enrich the SIXT ride experience and build trust and credibility with customers, we proposed displaying the driver’s photo for a booked SIXT ride. Customers can track the driver before pickup, and having the driver’s photo beforehand smoothens the overall ride experience!

So, we introduced a feature to upload a driver’s profile photo in the SIXT Driver Application.

But the main challenge in collecting driver photos was verifying whether a photo actually shows a human face; otherwise, it wouldn’t serve any purpose. To reduce the dependency on manual validation of these photos, we integrated Firebase ML Kit, which let us pre-validate a captured image before uploading it to the server.
Our validation was based on these factors:
1. Photo should contain only a single face.
2. Face area must be at least 30% of the entire image.
3. Photo should not contain any text.

I’ll talk about the text validation in a separate post. For now, let’s see how we integrated the ML Kit face detection library.

With the ML Kit SDK, face detection runs entirely on-device; no network API calls are made.

First of all, include the ML Kit libraries in your project via build.gradle. (Use current version numbers instead of the placeholders shown in the sample.)

// ML Kit dependencies
api "com.google.firebase:firebase-ml-vision:$versions.ml.vision"
implementation "com.google.firebase:firebase-ml-vision-image-label-model:$versions.ml.image"
implementation "com.google.firebase:firebase-ml-vision-face-model:$versions.ml.face"
implementation "com.google.firebase:firebase-ml-model-interpreter:$versions.ml.interpreter"

Face detection requires a FirebaseVisionFaceDetectorOptions instance to be set up. Here we can choose the classification and landmark modes.

Classification (enabled by setting the mode to ‘ALL_CLASSIFICATIONS’) provides smile and eye open/closed probabilities for each face.

Landmark detection provides the positions of facial landmarks such as MOUTH_BOTTOM, MOUTH_RIGHT, MOUTH_LEFT, RIGHT_EYE, LEFT_EYE, RIGHT_EAR and so on.

A PerformanceMode can also be set in the options: the better the accuracy of recognition, the more processing time is required.

Initialise the FirebaseVisionFaceDetector with the above options as below.

val options = FirebaseVisionFaceDetectorOptions.Builder()
    .setClassificationMode(FirebaseVisionFaceDetectorOptions.ALL_CLASSIFICATIONS)
    .setLandmarkMode(FirebaseVisionFaceDetectorOptions.NO_LANDMARKS)
    .build()

val detector: FirebaseVisionFaceDetector = FirebaseVision.getInstance()
    .getVisionFaceDetector(options)

Next, we need to create a FirebaseVisionImage instance and pass it to the detector. We do that in our own process(bitmap) method.

fun process(bitmap: WeakReference<Bitmap>) {
    // Bail out if the bitmap has already been garbage-collected
    val source = bitmap.get() ?: return
    detectInVisionImage(source, FirebaseVisionImage.fromBitmap(source))
}

private fun detectInVisionImage(originalCameraImage: Bitmap?, image: FirebaseVisionImage) {
    detector.detectInImage(image)
        .addOnSuccessListener { results -> onSuccess(originalCameraImage, results) }
        .addOnFailureListener { e -> onFailure(e) }
}

After the image is processed, we receive the corresponding success or failure callback.

In the success callback, we get a list of FirebaseVisionFace instances; the size of this list tells us how many faces were detected.
Each item carries information such as SmilingProbability, LeftEyeOpenProbability, RightEyeOpenProbability, HeadEulerAngleY, HeadEulerAngleZ and the face landmarks mentioned above.

Each face also carries a Rect, its BoundingBox, which gives the width and height of the face region. From this, we can calculate the fraction of the image covered by the face. This took care of our 30-percent face-area validation.
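As a sketch, the single-face and 30-percent checks boil down to simple arithmetic on the bounding box and image dimensions. The helper and the Box stand-in below are our own illustration (in the real success callback the dimensions would come from FirebaseVisionFace.getBoundingBox()), not part of the ML Kit API:

```kotlin
// Illustrative stand-in for the detector's bounding box dimensions
data class Box(val width: Int, val height: Int)

// Hypothetical validation helper covering rules 1 and 2
fun isValidDriverPhoto(faceBoxes: List<Box>, imageWidth: Int, imageHeight: Int): Boolean {
    // Rule 1: the photo must contain exactly one face
    if (faceBoxes.size != 1) return false
    // Rule 2: the face must cover at least 30% of the image area.
    // Long arithmetic avoids Int overflow on large images.
    val box = faceBoxes.first()
    val faceArea = box.width.toLong() * box.height
    val imageArea = imageWidth.toLong() * imageHeight
    return faceArea * 100 >= imageArea * 30
}
```

The integer comparison (faceArea * 100 >= imageArea * 30) sidesteps floating-point rounding at the threshold.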

I would suggest avoiding ultra-high-resolution images during detection, as they inflate processing time. On the other hand, images of too low a quality will lead to poor face detection results.
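One way to strike that balance is to cap the longer edge of the input before detection. The helper below is a sketch of our own (the name and the 1024-px cap are assumed, tunable values, not from ML Kit):

```kotlin
// Hypothetical helper: compute a scale factor so the longer edge of the
// input does not exceed maxEdge pixels. Returns 1f when no downscale is
// needed, so the original bitmap can be used unchanged.
fun downscaleFactor(width: Int, height: Int, maxEdge: Int = 1024): Float {
    val longerEdge = maxOf(width, height)
    return if (longerEdge <= maxEdge) 1f else maxEdge.toFloat() / longerEdge
}
```

On Android, the resulting factor could feed Bitmap.createScaledBitmap before building the FirebaseVisionImage.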

Pro tip: When you integrate ML Kit libraries, it is highly recommended to publish App Bundles (.aab) rather than plain signed APKs. The ML Kit libraries can inflate the APK size by 20+ MB; an App Bundle cuts down those extra megabytes by stripping .so files irrelevant to each device, keeping your app slim.

Another challenge was putting a UX flow in place to ensure that the driver uploads the photo. We achieved that by displaying a pop-up reminder when the app is used for the first time each day. This improved the adoption rate of the feature.

Along with face detection, ML Kit provides text recognition, barcode scanning, language identification and smart replies out of the box. It also supports features such as landmark detection and image labelling, which require API calls to process the images.

Like what you read? Don’t forget to share this post on Facebook, WhatsApp and LinkedIn.
