A Brief Introduction to Facial Recognition (Part 1)

Image via Freepik

Introduction

One of the most interesting avenues unlocked by Artificial Intelligence and Analytics is Facial Recognition, which is powered by Machine Learning algorithms. We see this application in use on a daily basis, for example in smartphones and at security stations.

In this introductory blog, I will provide a quick walkthrough of the technology involved in Facial Recognition.

What is Facial Recognition anyway?

Facial Recognition is a technique used to identify individuals by matching their faces against images saved in a data set. Although other methods of identification may be more accurate, Facial Recognition has remained a significant research topic because of its non-invasive nature and its ease of use.

Steps involved in Facial Recognition

There are various algorithms that can perform Facial Recognition, and their accuracy varies. Let us look at how Facial Recognition can be performed using Deep Learning.

Here, we make use of Face Embedding, where each face is converted into a vector; this technique is called Deep Metric Learning. Let me divide this process into three simple steps for easy understanding:

  1. Face Detection: The very first task is to detect faces in the image or video stream. Once we know the exact location (coordinates) of a face, we extract it for further processing.
  2. Feature Extraction: Now that we have cropped the face out of the image, we extract features from it. Here, we use face embeddings to represent those features.

A neural network takes an image of the person's face as input and outputs a vector that represents the most important features of the face. In Machine Learning, such a vector is called an embedding, so we call this vector a Face Embedding. Now, how does this help in recognizing the faces of different people?

During training, the network learns to output similar vectors for faces that look similar. For example, if I have multiple images of the same face taken at different times, some features of the face might change over time, though not drastically. The vectors associated with these images are therefore similar, that is, they lie very close together in the vector space. Take a look at the diagram below for a rough idea:

After training, the network outputs vectors that are closer to each other (more similar) for faces of the same person. The above vectors now transform into:

We are not going to train such a network here, as that takes a significant amount of data and computation power. Instead, we will use a network pre-trained by Davis King on a data set of ~3 million images. The network outputs a vector of 128 numbers that represents the most important features of a face.
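To make "close in the vector space" a little more concrete, here is a minimal sketch that measures the Euclidean distance between embeddings. The vectors are made-up 4-number toys (the real network outputs 128 numbers), and the function name `euclidean_distance` is my own choice for illustration, not part of any library.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy, hand-written embeddings (assumption: real ones have 128 numbers).
person_a_photo_1 = [0.10, 0.80, -0.30, 0.50]
person_a_photo_2 = [0.12, 0.78, -0.28, 0.52]
person_b_photo_1 = [-0.60, 0.10, 0.70, -0.20]

same_person = euclidean_distance(person_a_photo_1, person_a_photo_2)
different_people = euclidean_distance(person_a_photo_1, person_b_photo_1)

print(f"same person:      {same_person:.3f}")
print(f"different people: {different_people:.3f}")
```

The two photos of the same person end up much closer together than the photos of two different people, which is the property the trained network is meant to provide.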

Now that we know how this network works, let us see how to use it on our own data set. We pass all the images in our data set to this pre-trained network to get their embeddings and save them in a file for the next step.
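The save step could look roughly like the sketch below. Note that `embed_face` is a hypothetical placeholder that returns a dummy vector so the code runs without any model files; in practice it would call the pre-trained network. The JSON layout and the function names are also assumptions made for illustration.

```python
import json


def embed_face(image_path):
    """Hypothetical stand-in for the pre-trained network.

    A real version would load the image, find the face, and return the
    network's 128 numbers. Here we return a fixed dummy vector.
    """
    return [0.0] * 128


def build_embedding_file(labelled_images, output_path):
    """Compute one embedding per (name, image_path) pair and save them as JSON."""
    records = [
        {"name": name, "embedding": embed_face(path)}
        for name, path in labelled_images
    ]
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(records, f)
    return records


def load_embedding_file(path):
    """Read back the list of {"name", "embedding"} records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

For example, calling `build_embedding_file([("Bush", "images/bush_01.jpg")], "embeddings.json")` (both paths are made up) would write one record, which `load_embedding_file("embeddings.json")` can read back in the comparison step.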

  3. Comparing Faces: Now that we have saved the face embeddings for every face in our data set, the next step is to recognize a face in a new image that is not in our data. First, we compute the face embedding for the new image using the same network we used above, and then we compare this embedding with the embeddings we have saved. We recognize the face if the new embedding is close (similar) to any of the saved embeddings, as shown below:

Here, we passed two images: one of Vladimir Putin and the other of George W. Bush. In the example above, we had saved embeddings for Bush but not for Putin. Thus, when we compare the two new embeddings with the existing ones, the vector for Bush is close to the other face embeddings of Bush. In contrast, the face embedding of Putin is not close to any saved embedding, so the program cannot recognize him.
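The comparison described above can be sketched as a nearest-match search with a distance cut-off. Everything here is illustrative: the embeddings are toy 3-number vectors, `recognize` is a name I made up, and the 0.6 threshold is an assumption borrowed from the value commonly quoted for dlib's face recognition model, which you would tune for your own data.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(new_embedding, known_faces, threshold=0.6):
    """Return the name of the closest saved face, or None if none is close enough.

    known_faces is a list of (name, embedding) pairs.
    threshold is an assumed cut-off; distances above it count as "unknown".
    """
    best_name, best_distance = None, float("inf")
    for name, embedding in known_faces:
        distance = euclidean_distance(new_embedding, embedding)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= threshold else None


# Toy saved embeddings: we only have Bush in our data set.
known_faces = [
    ("Bush", [0.20, 0.90, 0.10]),
    ("Bush", [0.25, 0.85, 0.12]),
]

new_bush = [0.22, 0.88, 0.11]    # close to the saved Bush vectors
new_putin = [-0.70, 0.10, 0.80]  # far from everything we saved

print(recognize(new_bush, known_faces))   # Bush
print(recognize(new_putin, known_faces))  # None
```

Returning `None` for the second image mirrors the situation in the example: with no saved embedding nearby, the program has nothing to match the face against.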

Now, you might be interested to learn how this would play out using real data. In Part 2 of this blog series, I will walk through the same process with an example.

If you have reached this far, thank you for reading this blog. I hope you found it insightful 😃. Give us a follow for more content on technology, productivity, work habits, and more!
