How We Utilize Deep Learning Algorithms
The world is riding a wave of machine learning hype, especially around deep learning, as big data sets and computing power grow exponentially. More importantly, countless problems are being solved with deep learning, from computer vision to machine translation.
Since BinaryVR provides facial tracking, which is closely related to computer vision, we also see deep learning as a promising way to improve our product. We have been utilizing deep learning in BinaryFace development and undertaking deep learning R&D for future versions. In this post, we will briefly show how a deep learning algorithm is used to improve the performance of our facial tracker.
BinaryVR and BinaryFace
BinaryVR develops real-time facial tracking solutions for both VR and AR (mobile), based on our core technology of facial landmark detection. BinaryFace is our AR solution: it tracks facial landmarks across video frames in real time from 2D camera input. Simply put, BinaryFace lets applications apply real-time AR effects, much like Snapchat.
Where and why we utilize deep learning algorithms
The goals of BinaryFace are simple: accuracy, reliability, and speed. First of all, building an accurate and reliable SDK requires a considerable amount of data. Since machine learning algorithms learn iteratively from data, we need to prepare data sets consisting of input (face images) and output (landmark annotation) pairs. Here is an example of such a data set.
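As a concrete illustration, one training sample can be represented as a face image paired with its landmark coordinates. The array shapes and the 68-point landmark convention below are our assumptions for illustration, not BinaryVR's actual internal format:

```python
import numpy as np

# Hypothetical supervised-learning sample: input image + output annotation.
# 256x256 RGB and 68 landmarks are illustrative assumptions only.
image = np.zeros((256, 256, 3), dtype=np.uint8)    # input: face image
landmarks = np.zeros((68, 2), dtype=np.float32)    # output: (x, y) per landmark


def make_sample(image, landmarks):
    """Bundle one (input, output) pair for training a landmark detector."""
    assert landmarks.ndim == 2 and landmarks.shape[1] == 2
    return {"image": image, "landmarks": landmarks}


sample = make_sample(image, landmarks)
```

A data set is then simply a large collection of such pairs, which the model learns from iteratively.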
Even with enough face images on hand, we had to find a way to collect annotations like the ones above. To generate clean annotated data, the first idea that comes to mind is to ask human annotators for help. However, relying on human labor to collect large amounts of data is cumbersome and cost-ineffective. How can we secure massive amounts of clean data in a comparatively short time?
This is where deep learning comes in. We developed a deep learning algorithm that detects facial feature points from input images. As long as we provide the algorithm with face images, the output data is generated automatically. Thanks to this algorithm, our annotation database has scaled up immensely, supporting our SDK optimization in terms of both speed and accuracy.
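The auto-annotation loop described above can be sketched as follows. The `detect_landmarks` stand-in is hypothetical; in practice it would be the trained deep network's forward pass:

```python
# Sketch of an auto-annotation pipeline: a trained landmark detector labels
# raw face images, replacing manual human annotation.
def detect_landmarks(image):
    # Placeholder for the deep model's inference; returns 68 (x, y) points.
    return [(float(i), float(i)) for i in range(68)]


def build_annotation_db(images):
    """Pair each unlabeled image with machine-generated landmark annotations."""
    return [{"image": img, "landmarks": detect_landmarks(img)} for img in images]


db = build_annotation_db(["face_001.jpg", "face_002.jpg"])
```

Because labeling is now a batch inference job rather than manual work, the database can grow as fast as new face images are collected.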
Deep learning algorithm & BinaryFace SDK
Some might wonder: if the deep learning algorithm can generate the data, why not apply it directly to our tracker? As the question points out, we have already developed a deep learning algorithm that tracks facial landmarks in real time, and it is very accurate and reliable. However, it is not optimized for mobile applications. Because deep learning uses artificial neural networks with multiple layers (the reason it is called 'deep' learning), it needs more computing power than other machine learning algorithms. The actual bottleneck is the computing power of regular mobile phones: running the deep learning algorithm directly would make a phone slow and overloaded.
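A rough back-of-the-envelope count shows why stacked layers are costly: the multiply-accumulate (MAC) operations grow with every layer added. The layer sizes below are made up purely for illustration:

```python
# Why "deep" costs more: MAC operations per forward pass through a stack of
# fully connected layers. Layer widths here are illustrative, not from any
# real model.
def mac_count(layer_sizes):
    """Multiply-accumulate operations for one forward pass."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


shallow = mac_count([1024, 136])               # single linear mapping
deep = mac_count([1024, 512, 512, 256, 136])   # several stacked layers
```

Even in this toy example the stacked network needs several times more arithmetic per frame, which is exactly the kind of load a regular mobile phone struggles to sustain in real time.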
Beyond accuracy and reliability, the key competitive advantages of BinaryFace are its speed and lightness. It must be fast enough to track multiple faces in real time and light enough for the regular mobile phones we use every day. Our developers optimized BinaryFace for these requirements, and it can now track up to 20 faces in real time without overload, even on low-specification phones.
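To put the 20-face claim in perspective, here is the per-face time budget it implies. The 30 fps target is our assumption for the arithmetic, not a stated BinaryFace spec:

```python
# Back-of-the-envelope time budget for real-time multi-face tracking.
# FPS is an assumed target; FACES comes from the 20-face figure above.
FPS = 30
FACES = 20

frame_budget_ms = 1000.0 / FPS               # ~33.3 ms available per frame
per_face_budget_ms = frame_budget_ms / FACES  # ~1.7 ms per tracked face
```

A budget of under two milliseconds per face is what makes a lightweight, heavily optimized tracker necessary on mobile hardware.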
In short, the two algorithms perform the same task of detecting facial features but focus on different aspects, since each solves a different problem.
Examples of the deep learning algorithm and BinaryFace SDK
BinaryFace performs robustly in real-world, in-the-wild situations such as changes in pose, expression, occlusion, and illumination. In most cases, BinaryFace and the deep learning algorithm produce the same result, thanks to the sophisticated engineering of the BinaryFace SDK.
However, in some extreme cases with rare facial expressions, the deep learning algorithm and the BinaryFace SDK produce slightly different results. Here is a good example of such an extreme case.
The Possibility of Deep Learning & Mobile AR
We believe that deep learning will have its era in the near future as the technology advances, and BinaryVR's capability will grow with it, backed by top-tier professionals specializing in artificial neural networks (ANNs).
Mobile computing power will keep improving, and ever more efficient deep learning algorithms will be unveiled. Now that Apple's ARKit is available, the mobile phone will become one of the dominant AR platforms, and people will start to enjoy a wide variety of AR content!
Looking ahead, BinaryVR is continuing its deep learning R&D and preparing the next versions of both the VR and AR (mobile) SDKs, utilizing the latest deep learning algorithms to provide the highest-quality facial tracking solution for content creators!
If you are interested in BinaryVR's vision and technology, follow us and check out our blog to stay updated! BinaryVR brings humanity into the virtual world from the very front line of cutting-edge technology.