Android and 3D Cameras: Facial Recognition with Fraud Protection

Vladimir Shal’kov
Surf
Aug 6, 2020

Hi, my name is Vladimir Shal’kov, I am an Android developer at Surf.

Recently, our team had to implement a facial recognition system with fraud protection on Android. I will share some of the most interesting aspects of this work, along with code examples and helpful links. I am sure you will learn something new from this article, so make yourself comfortable and let us begin.

Facial recognition systems are becoming increasingly popular as more and more devices use the face unlock feature, and more tools become available for developers.

Apple is actively using Face ID in its products and even added an API so developers can access this functionality. Face ID is considered fairly secure and can even be used to unlock banking apps. Until quite recently, the Android SDK had no ready-made solution: although device manufacturers added face unlock features to their builds of the operating system, developers were still unable to use them in their apps, and security was not exactly their strong suit.

Not so long ago, the FingerprintManager class responsible for unlocking apps with a fingerprint was deprecated on API 28 and higher to encourage developers to use BiometricPrompt. This class contains the logic related to biometrics, including facial identification. However, you cannot use it on every smartphone: according to Google, it is only supported on devices with a high security rating.

Some devices do not have a built-in fingerprint scanner at all. Manufacturers abandoned it because facial recognition backed by a ToF (Time-of-Flight) sensor offers a higher level of fraud protection: the sensor builds a depth map, which increases the system’s resistance to spoofing.

Requirements

The application we were working on served as an access control system that used facial recognition as the means of identification. Special algorithms check whether a face belongs to a real person. You can add a new user to the database right from the device by taking a photo and entering a name. To determine whether a certain person is in the database, you take a photo with the device in real time: the algorithms check it for similarity with the faces already in the database and, on success, display that person’s information.

Our main goal for this project was to ensure maximum security. We had to minimize opportunities for bypassing the facial recognition system, for instance, by holding a photo in front of the viewfinder. To achieve this, we decided to use an Intel RealSense 3D camera (model D435i) with a built-in ToF sensor able to get all the data needed to build a depth map.

We had to use a tablet with a large screen as our working device; it had no built-in battery and required a constant connection to a power outlet.

There was another significant limitation: the system had to work offline, which made cloud facial recognition services impossible to use. Apart from that, writing facial recognition algorithms from scratch would have been unreasonable given the time and labor involved; why reinvent the wheel when ready-made solutions are available? All things considered, we chose the Face SDK library by 3DiVi.

Acquiring images from the Intel RealSense camera

First, we had to get two images from the 3D camera: a color image and a depth map, which the Face SDK library would later use for its calculations.

To get started with the Intel RealSense camera in an Android project, add the RealSense SDK for Android OS dependency, which is a wrapper around the official C++ library. Initialization and displaying the image from the cameras are covered in the official samples and are quite simple, so we will not dwell on them. Let us go straight to the code for acquiring images.
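Here is a rough sketch based on the official samples, where pipeline, align, and glSurfaceView are assumed to be a started Pipeline, an Align(StreamType.COLOR) filter, and a GLRsSurfaceView initialized elsewhere:

```kotlin
import com.intel.realsense.librealsense.*

private fun captureFrames() {
    FrameReleaser().use { fr ->
        // Wait for a synchronized pair of color and depth frames
        val frames: FrameSet = pipeline.waitForFrames().releaseWith(fr)
        // Align the depth stream to the color stream so their pixels match up
        val aligned: FrameSet = frames.applyFilter(align).releaseWith(fr)

        // Convert the frames to their concrete types
        val videoFrame: VideoFrame = aligned.first(StreamType.COLOR)
            .releaseWith(fr)
            .`as`(Extension.VIDEO_FRAME)
        val depthFrame: DepthFrame = aligned.first(StreamType.DEPTH)
            .releaseWith(fr)
            .`as`(Extension.DEPTH_FRAME)

        // Show the image from the color camera on the screen
        glSurfaceView.upload(videoFrame)

        // videoFrame and depthFrame are converted to images in the next section
    }
}
```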

By using FrameReleaser() we get individual frames of the Frame type from the video stream. applyFilter() allows you to apply various filters to the frames.

In order to acquire a frame in the desired format, convert it to the appropriate type: in our case, the first frame is a VideoFrame and the second a DepthFrame.

If you want to display the image on the device’s screen, use the upload() method and specify the frame type you want displayed; in our case, that is the image from the color camera.

Converting frames to images

The next step is acquiring images in the desired formats from VideoFrame and DepthFrame. We will use these images to determine whether the face belongs to a real person and then add the information to the database.

Image formats:

  • Color image with the .bmp extension acquired from VideoFrame
  • Depth map with the .tiff extension acquired from DepthFrame

In order to convert the frames to images, you will need OpenCV, an open-source computer vision library: form a Mat object from the frame data and convert it to the desired format.

To save a color image, create a matrix of the CvType.CV_8UC3 type, then convert it to BGR so the colors look natural (OpenCV stores channels in BGR order).

Use the Imgcodecs.imwrite method to save it to the device.
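A sketch of this step, assuming the videoFrame from the capture code above; the raw bytes are copied out with the wrapper’s Frame.getData() method:

```kotlin
import com.intel.realsense.librealsense.VideoFrame
import org.opencv.core.CvType
import org.opencv.core.Mat
import org.opencv.imgcodecs.Imgcodecs
import org.opencv.imgproc.Imgproc
import java.io.File

private fun saveColorImage(videoFrame: VideoFrame, file: File) {
    // Copy the raw RGB bytes out of the frame
    val data = ByteArray(videoFrame.dataSize)
    videoFrame.getData(data)

    // Wrap the bytes in an 8-bit, 3-channel matrix
    val rgbMat = Mat(videoFrame.height, videoFrame.width, CvType.CV_8UC3)
    rgbMat.put(0, 0, data)

    // OpenCV stores channels in BGR order, so convert before writing
    val bgrMat = Mat()
    Imgproc.cvtColor(rgbMat, bgrMat, Imgproc.COLOR_RGB2BGR)

    // The file extension (.bmp) determines the output format
    Imgcodecs.imwrite(file.absolutePath, bgrMat)
}
```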

Do the same for DepthFrame, except that the matrix must be of the CvType.CV_16UC1 type, since the image is constructed from 16-bit depth sensor data.

Saving an image containing a depth map works the same way.
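Here is a sketch under the same assumptions, this time for the depthFrame (imports as above, plus java.nio):

```kotlin
import com.intel.realsense.librealsense.DepthFrame
import java.nio.ByteBuffer
import java.nio.ByteOrder

private fun saveDepthImage(depthFrame: DepthFrame, file: File) {
    // Copy the raw depth bytes out of the frame
    val data = ByteArray(depthFrame.dataSize)
    depthFrame.getData(data)

    // Each depth value is 16 bits, so reinterpret the bytes as shorts
    val depthValues = ShortArray(data.size / 2)
    ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN)
        .asShortBuffer().get(depthValues)

    // A single-channel 16-bit matrix preserves the full sensor precision
    val depthMat = Mat(depthFrame.height, depthFrame.width, CvType.CV_16UC1)
    depthMat.put(0, 0, depthValues)

    // .tiff supports 16-bit grayscale, unlike .bmp
    Imgcodecs.imwrite(file.absolutePath, depthMat)
}
```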

Working with the Face SDK library

Face SDK contains a large number of software components, but we only need some of them for the purposes of this article. Like RealSense SDK, this library is written in C++ and has a wrapper for more convenient use on Android. Face SDK is not free, but a trial license is available for developers.

Most library components can be configured using XML configuration files. The algorithms may vary depending on the configuration.

To get started, create an instance of the FacerecService class. It is responsible for initializing other components; its parameters are the paths to the DLL libraries, configuration files, and the license.
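Here is a sketch following the 3DiVi demo apps (called from an Activity; the exact paths depend on how the SDK files are bundled with the app):

```kotlin
val service: FacerecService = FacerecService.createService(
    applicationInfo.nativeLibraryDir + "/libfacerec.so",  // native library
    applicationInfo.dataDir + "/fsdk/conf/facerec",       // configuration files
    applicationInfo.dataDir + "/fsdk/license"             // license directory
)
```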

The next step is creating objects of the FacerecService.Config and Capturer classes; a sketch follows after the next paragraph.

The Capturer class is responsible for face detection. The manual_capturer.xml configuration uses algorithms from the OpenCV library, namely the Viola-Jones frontal face detector based on Haar-like features. The library ships a ready-made set of XML configurations that differ in recognition quality and running time: slower methods deliver better recognition quality. To detect faces in profile, use common_lprofile_capturer.xml. There are plenty of configs, and their descriptions can be found in the documentation. For our project, we used the common_capturer4_singleface.xml config: a configuration with a lowered quality threshold that always returns no more than one face.
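Here is a sketch of creating both objects with that config; Config is an inner class of FacerecService, hence the instance-qualified constructor call:

```kotlin
val capturerConfig = service.Config("common_capturer4_singleface.xml")
val capturerSingleFace: Capturer = service.createCapturer(capturerConfig)
```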

In order to find a face in a picture, use the capturerSingleFace.capture() method, which receives a byte array of the image containing a person’s face.
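A sketch, assuming colorImageFile points at the .bmp we saved earlier:

```kotlin
val imageBytes: ByteArray = colorImageFile.readBytes()
// With common_capturer4_singleface.xml the vector holds at most one face
val samples: Vector<RawSample> = capturerSingleFace.capture(imageBytes)
val rawSample: RawSample? = samples.firstOrNull()
```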

The RawSample object stores information about the detected face and provides a set of methods. For instance, calling getLandmarks() returns the anthropometric feature points of the face.

Determining whether the person is real

In order to verify whether the person in the image is real or merely a photograph held up to the camera, Face SDK offers the DepthLivenessEstimator module, which returns an enum with one of four values:

  • NOT_ENOUGH_DATA — too many missing values on the depth map
  • REAL — face belongs to a living person
  • FAKE — face is a photograph
  • NOT_COMPUTED — calculation failed

Initializing the module:
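(The config file name below follows the Face SDK distribution and may differ between versions.)

```kotlin
val depthLivenessEstimator: DepthLivenessEstimator =
    service.createDepthLivenessEstimator("depth_liveness_estimator.xml")
```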

Determining whether the face belongs to a real person:
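(A rough sketch of such a check. The DepthMapRaw field names follow the SDK documentation; the frame size constants, the assumption that capture() also accepts a RawImage, and the readDepthMap() JNI helper described below are specific to our setup.)

```kotlin
fun getLivenessState(
    colorPixels: ByteArray,  // raw RGB pixels of the color frame
    depthMapPath: String     // path to the saved .tiff depth map
): DepthLivenessEstimator.Liveness {
    // Wrap the color pixels in a RawImage and detect the face on it
    val rawImage = RawImage(
        COLOR_WIDTH, COLOR_HEIGHT, RawImage.Format.FORMAT_RGB, colorPixels
    )
    val originalRawSample = capturerSingleFace.capture(rawImage).firstOrNull()
        ?: return DepthLivenessEstimator.Liveness.NOT_COMPUTED

    // Describe the depth map registered to the original color image
    val depthMapRaw = DepthMapRaw()
    depthMapRaw.depth_map_rows = DEPTH_HEIGHT
    depthMapRaw.depth_map_cols = DEPTH_WIDTH
    depthMapRaw.depth_unit_in_mm = 1f  // D435 reports depth in 1 mm units by default
    // depth_data_ptr expects a native pointer; see the JNI helper below
    depthMapRaw.depth_data_ptr = readDepthMap(depthMapPath)
    depthMapRaw.depth_data_stride_in_bytes = DEPTH_WIDTH * 2

    return depthLivenessEstimator.estimateLiveness(originalRawSample, depthMapRaw)
}
```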

The getLivenessState() method receives references to the images, a color image and a depth map, as parameters. A RawImage is created from the color image; this class provides raw image data along with optional cropping information. The depth map is used to generate a DepthMapRaw, a depth map registered in accordance with the original color image. This is necessary for calling the estimateLiveness(originalRawSample, depthMapRaw) method, which returns an enum indicating whether the face in the picture belongs to a living person.

Let us elaborate on how DepthMapRaw is generated. One of its variables, depth_data_ptr, is a pointer to the depth data, but as we know, there are no pointers in Java. To acquire one, we use a JNI function that accepts a reference to the depth map as an argument and returns the address of its data.

To call the code written in C in Kotlin, you have to create a class of the following type:
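(The class name here is illustrative; the matching C++ function in native-lib.cpp is outlined in the comments.)

```kotlin
// The native side in native-lib.cpp would look roughly like:
//   extern "C" JNIEXPORT jlong JNICALL
//   Java_..._DepthPointerReader_readDepthMap(JNIEnv* env, jobject, jstring path)
// e.g. loading the .tiff with cv::imread and returning
// reinterpret_cast<jlong>(mat.data)
class DepthPointerReader {

    // `external` marks a function whose implementation lives in native code
    external fun readDepthMap(depthMapPath: String): Long

    companion object {
        init {
            // Loads libnative-lib.so, built from native-lib.cpp
            System.loadLibrary("native-lib")
        }
    }
}
```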

System.loadLibrary() receives the name of the native library built from the .cpp file that contains the readDepthMap() implementation, native-lib in our case. On top of that, you have to mark the method with the external modifier, which indicates that it is not implemented in Kotlin.

Facial identification

Identifying a person in the picture is no less important. Face SDK allows you to implement it using the Recognizer module. The initialization goes as follows:
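(A sketch; the boolean flags stand for template creation, matching, and reduced memory consumption, but treat the exact signature as an assumption.)

```kotlin
val recognizer: Recognizer = service.createRecognizer(
    "method8v7_recognizer.xml",
    true,   // processing: we will create templates
    true,   // matching: we will compare templates
    false   // processing_less_memory_consumption
)
```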

We are using a configuration file called method8v7_recognizer.xml, which offers the highest recognition speed, at the cost of lower recognition quality compared to methods 6v7 and 7v7.

Before you can identify a person, you have to create a list of faces that will be used to find a match for the sample photo. To implement this, create a Vector of Template objects:
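(Here databaseSamples is assumed to hold a RawSample for every face previously added to the database.)

```kotlin
val templates = Vector<Template>()
for (sample in databaseSamples) {
    templates.add(recognizer.processing(sample))
}
```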

Use the recognizer.processing() method, which takes a RawSample as a parameter, to create each Template. Once the list of face templates has been generated, add it to the Recognizer and save the resulting TemplatesIndex, which is needed for quick searches in large databases:
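(The thread count is illustrative, and the exact createIndex() signature may vary between SDK versions.)

```kotlin
val searchThreadsCount = 1
val templatesIndex: TemplatesIndex =
    recognizer.createIndex(templates, searchThreadsCount)
```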

At this point we have already created the Recognizer object containing all the necessary data to perform identification:
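(A sketch, assuming querySample is the RawSample captured from the new photo; the acceleration type name follows the Face SDK documentation.)

```kotlin
val queryTemplate: Template = recognizer.processing(querySample)
val searchResults = recognizer.search(
    queryTemplate,
    templatesIndex,
    1,  // how many nearest matches to return
    Recognizer.SearchAccelerationType.SEARCH_ACCELERATION_1
)
```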

The recognizer.search() function returns a result from which we can get the index of the found element, match it against the list of faces from the database, and identify the person. In addition, we can obtain the degree of similarity, a real number from 0 to 1. This data is exposed by the Recognizer.MatchResult class in the score variable:
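(In this sketch databasePersons, showPersonInfo() and SCORE_THRESHOLD are hypothetical application-side names.)

```kotlin
val bestMatch = searchResults.firstOrNull()
if (bestMatch != null && bestMatch.match_result.score > SCORE_THRESHOLD) {
    // `i` is the index into the template list we built from the database
    val person = databasePersons[bestMatch.i.toInt()]
    showPersonInfo(person)
}
```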

In conclusion

No doubt, systems like this will become ubiquitous: doors will automatically open when approached and your office coffee machine will always choose your favorite grind size.

The Android SDK is currently gaining an API for working with facial identification, but it is still at an early stage of development. As for our access control system built with an Android tablet, Face SDK, and the Intel RealSense camera, I can point out a great deal of flexibility and scalability. You are not bound to a particular device: you can connect the camera to any modern smartphone, expand the range of supported 3D cameras, or connect several cameras to a single device. The app can also be adapted to Android Things and used in a smart home. And if you look further at the features of the Face SDK library, you can add facial identification to a continuous video stream and determine people’s gender, age, and emotions. All of this leaves vast room for experimentation, and, speaking from our own experience, you should not be afraid to experiment and challenge yourselves!
