The technology behind Face Filters
We are all guilty of spending time trying on new filters on Instagram and Snapchat. From dog faces and flower crowns, we have come as far as putting on a perfect face of makeup with a single click. The user-facing side of these apps may look silly, but the technology behind the filters isn't. Let us decipher the engineering behind these face filters and how they work.
The technology behind these filters came from a Ukrainian startup called 'Looksery', whose app let users customize and adjust their facial features during video chats and in photos.
Snapchat acquired Looksery in September 2015 for 150 million dollars, believed to be the largest technology acquisition in Ukrainian history.
These augmented reality filters rely on the massive and swiftly growing field of computer vision, which uses pixel data from a camera to identify objects and interpret 3D space. Computer vision is a multidisciplinary field concerned with how computers can extract a high-level understanding from digital images or videos; from an engineering perspective, it seeks to automate tasks that the human visual system does naturally. Computer vision is now used in convenience stores, driverless car testing, everyday medical diagnostics, and monitoring the health of crops and livestock, and it is also the reason people can get fake freckles and colored hair with a single click!
The specific area of computer vision that filters rely on is image processing: transforming an image by performing mathematical operations on each of its pixels.
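To make that concrete, here is a minimal sketch of two pixel-wise operations in Python using OpenCV and NumPy (the file name selfie.jpg is just a placeholder):

```python
import cv2
import numpy as np

# Hypothetical input file; any photo will do
img = cv2.imread("selfie.jpg")

# A pixel-wise operation: brighten the photo by adding 40 to every pixel value
brighter = np.clip(img.astype(np.int16) + 40, 0, 255).astype(np.uint8)

# Another pixel-wise operation: convert to grayscale, a weighted sum of each
# pixel's blue, green, and red values
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cv2.imwrite("brighter.jpg", brighter)
cv2.imwrite("gray.jpg", gray)
```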
Face detection is not a new technology. Facebook has used it for years, and even the basic digital cameras people owned before DSLRs could detect faces, so it is clearly not an impossible task. But the precision with which Snapchat and Instagram filters locate the face is a step beyond what your digital SLR manages. A widely used approach combines a Histogram of Oriented Gradients (HOG) feature descriptor with a Support Vector Machine (SVM) classifier, which delivers moderate to fairly good detection rates given a good-quality image.
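As an illustration, the default face detector in the open-source dlib library is exactly this HOG-plus-linear-SVM combination. A minimal sketch, with a placeholder image file:

```python
import cv2
import dlib

# dlib's default frontal face detector is a HOG feature extractor
# combined with a linear SVM classifier
detector = dlib.get_frontal_face_detector()

img = cv2.imread("selfie.jpg")          # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# The second argument upsamples the image once, which helps with smaller faces
faces = detector(gray, 1)
for rect in faces:
    print("face at", rect.left(), rect.top(), rect.right(), rect.bottom())
```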
The first step in the process is face detection. This is something the human brain is fantastic at, but when a computer looks at a face, or any image, all it sees is a grid of pixels, each one just a numeric color code.
The computer processes each image as a large matrix of numbers, where each combination of values represents a different color. The face detection algorithm scans this matrix looking for patterns and areas of contrast that tend to mark out a face. Different parts of the face give away important clues: the area surrounding the bridge of the nose is darker than the bridge itself, the forehead is lighter than the eye sockets, which are darker than the rest of the face, and the center of the forehead is lighter than its sides.
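A rough sketch of that idea in Python: given a face box (placeholder coordinates below), the band containing the eye sockets will usually average darker than the forehead above it.

```python
import cv2

# A grayscale image is just a matrix of numbers from 0 (black) to 255 (white)
gray = cv2.cvtColor(cv2.imread("selfie.jpg"), cv2.COLOR_BGR2GRAY)

# Suppose a face was detected in the box (x, y, w, h); placeholder values here
x, y, w, h = 100, 80, 200, 200

forehead = gray[y : y + h // 4, x : x + w]            # upper quarter of the face box
eye_band = gray[y + h // 4 : y + h // 2, x : x + w]   # band containing the eye sockets

# The eye band is typically darker than the forehead; detectors exploit
# exactly these kinds of brightness contrasts
print("forehead mean:", forehead.mean(), "eye band mean:", eye_band.mean())
```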
The groundwork for this technology was laid by a face detection method called the Viola-Jones algorithm. It repeatedly scans across the image at different scales, computing the difference between the sums of grayscale pixel values under dark and light rectangles. When enough of these feature tests match in one region of the image, the computer identifies a face there. The algorithm works well on frontal faces but struggles with faces turned to the side, and it is how digital cameras have been drawing boxes around faces for years.
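OpenCV ships a pre-trained Viola-Jones detector as a Haar cascade, so you can try this yourself. A minimal sketch, assuming the opencv-python package and a placeholder image file:

```python
import cv2

# OpenCV bundles the Viola-Jones (Haar cascade) frontal face model
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

gray = cv2.cvtColor(cv2.imread("selfie.jpg"), cv2.COLOR_BGR2GRAY)

# The detector scans the image at several scales, comparing sums of pixel
# values under light and dark rectangles (Haar-like features)
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    print("face box:", x, y, w, h)
```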
But in order to apply a virtual lipstick or the perfect winged eyeliner, the application needs to do more than just detect the face. It also has to locate particular facial features, which requires complex calculations.
This could take a lot of time, so Snapchat built a statistical model of the face by manually marking the borders of facial features on hundreds, sometimes thousands, of sample images. When your face appears on the screen, this predefined set of points is placed over it, scaled and rotated according to where the face has already been detected. This statistical model is called an "active shape model".
But this initial placement is not an exact match, so the model examines the pixels around each point, looking for edges defined by contrasts between dark and bright areas. From the training images it has a template for what, say, the bottom of your lip should look like, so it searches for matching areas of contrast to find precisely where your jawline, eyebrows, eyes, lips, and other facial features sit in your image, and shifts each point to match. Once these points are located, the active shape model is adjusted accordingly.
Because some of these individual guesses might be wrong, the model corrects and smooths them by taking into account the locations of all the other points. Once your facial features have been precisely located, the app uses those points as coordinates to create a sophisticated mesh.
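The article describes Snapchat's approach as an active shape model; as a hands-on stand-in, the open-source dlib library offers a pre-trained 68-point landmark predictor (based on an ensemble of regression trees rather than an active shape model) that produces the same kind of labeled points. A minimal sketch, with placeholder file names:

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
# Pre-trained 68-point landmark model, downloaded separately from dlib.net
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = cv2.imread("selfie.jpg")                     # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

rect = detector(gray, 1)[0]                        # assume one face was found
shape = predictor(gray, rect)

# 68 (x, y) points outlining the jaw, eyebrows, eyes, nose, and lips
points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
print(points[:5])
```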
This mesh is a 3D mask that moves, scales, and rotates along with your face as the video data comes in frame by frame, and once the app has it, it can do a lot with it: deform the mask to reshape your face, change your eye color, add accessories, or even trigger animations when you raise your eyebrows or open your mouth.
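As a crude 2D stand-in for that kind of mask manipulation, here is a sketch that tints the lip region using the landmark points from the previous snippet (points and img are assumed from there; indices 48 to 59 are the outer lip in dlib's 68-point scheme):

```python
import cv2
import numpy as np

# "points" holds the 68 (x, y) landmarks and "img" is the original BGR frame,
# both carried over from the landmark example above
lips = np.array(points[48:60], dtype=np.int32)   # outer lip outline

# Paint the lip region on a separate layer, then blend it back in,
# a crude stand-in for the mesh deformation real filters perform
overlay = img.copy()
cv2.fillPoly(overlay, [lips], color=(60, 60, 200))        # BGR reddish tint
tinted = cv2.addWeighted(overlay, 0.4, img, 0.6, 0)

cv2.imwrite("virtual_lipstick.jpg", tinted)
```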
This whole facial feature recognition process has finished by the time you see that white net appear just before you choose a filter. The filter can then distort certain areas of the face, enhance them, or add something on top of them.
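And a toy example of the "distort" side: a simple bulge effect that magnifies a circular region, the same idea behind big-eye filters. Real filters do this through the 3D mesh; this is only a flat 2D sketch, and the centre coordinates below are placeholders where an eye landmark would normally be used.

```python
import cv2
import numpy as np

# A toy "big eye" style distortion: magnify a circular region around (cx, cy)
# by sampling pixels closer to the centre
def bulge(img, cx, cy, radius, strength=0.5):
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    dx, dy = xs - cx, ys - cy
    r = np.sqrt(dx ** 2 + dy ** 2)
    # Inside the circle, pull the sampling position toward the centre,
    # which makes that region appear enlarged in the output
    scale = np.where(r < radius, (r / radius) ** strength, 1.0)
    map_x = (cx + dx * scale).astype(np.float32)
    map_y = (cy + dy * scale).astype(np.float32)
    return cv2.remap(img, map_x, map_y, interpolation=cv2.INTER_LINEAR)

img = cv2.imread("selfie.jpg")                     # hypothetical input image
out = bulge(img, cx=180, cy=160, radius=40)        # placeholder eye position
cv2.imwrite("big_eye.jpg", out)
```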
What makes this technology impressive is not so much the underlying concepts as the ability to run all of them in real time on a mobile device; that level of processing speed is a fairly recent development in the field. Now that you know the technology behind these face filters, you can enjoy taking pictures with them even more.