iPhone X could be a turning point for face recognition
Tomorrow morning, Apple will announce the iPhone X, most likely with Face ID. Several years after Google quietly added face recognition as a screen-unlock option in Android (neither secure nor easy to set up), and several months after Samsung introduced face recognition in the Galaxy S8 (where it was defeated almost immediately), Apple finally has a shot at bringing face recognition to mainstream consumers. If the rumors that the iPhone X is equipped with a front-facing 3D camera are true, my money is on Apple succeeding.
At Vcognition, we spend a lot of time solving real-world problems with computer vision, and face recognition is one of the first problems we looked at. We developed a highly successful algorithm by combining traditional computer vision techniques with deep learning. Our goal was to identify faces in video footage from commodity network cameras, with access control (door or car entry) as the primary market. This problem, also called unconstrained face recognition, is one of the harder problems in computer vision.
Compared to the unconstrained variant, mobile face recognition is an easier problem to solve, at least from the standpoint of accuracy. What makes it challenging is the limited memory and compute power on the device. Not only is robust face recognition a compute-intensive operation, the device’s limited resources are also needed to perform a real-time liveness check on the subject to prevent spoofing. If Apple wants to avoid the fate of the Samsung Galaxy S8’s face recognition feature, the iPhone has to be able to distinguish between a live human face and a video (not just a picture) of a human face shown to it on another screen. And all of this needs to happen in real time, i.e. within milliseconds.
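To make that time budget concrete, here is a minimal sketch of the steps an on-device unlock pipeline has to squeeze into roughly one camera frame. Everything here is hypothetical: the stage functions (detect_face, check_liveness, embed_face, match_template) are placeholders I am inventing for illustration, not Apple’s or anyone else’s actual API.

```python
import time

# Roughly one frame at 30 fps; the whole unlock decision should fit inside it.
FRAME_BUDGET_S = 0.033

def try_unlock(frame, enrolled_template,
               detect_face, check_liveness, embed_face, match_template):
    """Run detection, liveness, embedding, and matching on one frame,
    and flag it if the pipeline blows the single-frame time budget.
    All four stage functions are hypothetical placeholders."""
    start = time.monotonic()

    face = detect_face(frame)
    if face is None:
        return False

    # Liveness has to reject replayed video, not just printed photos.
    if not check_liveness(face):
        return False

    embedding = embed_face(face)
    matched = match_template(embedding, enrolled_template)

    elapsed = time.monotonic() - start
    if elapsed > FRAME_BUDGET_S:
        # Over budget: the unlock would feel laggy even when it succeeds.
        print(f"pipeline took {elapsed * 1000:.1f} ms, over the frame budget")
    return matched
```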
A 3D camera can provide a huge assist to Apple’s face recognition efforts. Depth information will not only improve recognition accuracy, it will also help identify and protect against video-based attacks more reliably. Another likely benefit is speed: an additional dimension of data (in this case, depth) can reduce the number of steps in an algorithm, or the number of layers needed in a model, to reach the same accuracy as before. Of course, additional data means additional I/O, so there is a trade-off here, but for the most part I think depth data will speed up the operation.
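As an illustration of why a depth channel makes anti-spoofing cheaper, here is a rough sketch of the kind of check it enables. This is my own toy example with an arbitrary threshold, not a description of how Apple (or anyone else) actually does it. The idea: a face replayed on another screen is essentially a plane, while a real face has millimetres to centimetres of relief around the nose, brow, and chin.

```python
import numpy as np

def looks_flat(depth_face, planarity_threshold_m=0.004):
    """Crude depth-based spoof check: fit a plane to the depth values in
    the detected face region and measure how far they deviate from it.
    A nearly planar region suggests a replay on a screen rather than a
    live face. The threshold is an illustrative guess, not a tuned value.

    depth_face: 2-D array of per-pixel depth in metres for the face crop.
    Returns True if the region is too flat to be a live face.
    """
    h, w = depth_face.shape
    ys, xs = np.mgrid[0:h, 0:w]
    valid = np.isfinite(depth_face) & (depth_face > 0)

    # Solve z ~= a*x + b*y + c by least squares over the valid pixels.
    A = np.column_stack([xs[valid], ys[valid], np.ones(valid.sum())])
    z = depth_face[valid]
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)

    residual = np.abs(A @ coeffs - z)
    return np.median(residual) < planarity_threshold_m
```

In practice a check like this would be only one signal among several, combined with motion cues and the recognition score itself, but it shows how a single extra channel of data can stand in for a much more elaborate 2-D analysis.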
We are excited about, and looking forward to, Apple’s event tomorrow. Anything that can pull computer vision (especially face recognition) out of the spooky realm of surveillance and into use cases of convenience and delight will be a welcome development for everyone in the field.