Growing up as a '90s kid, I have always been fascinated by the gadgets and systems depicted in the sci-fi movies of that time, be it the fancy GAP store in Minority Report, where an AI greets you, or the first Mission Impossible movie with its biometric security.
Fast forward to 2018: on the latest phones in the market, fingerprint readers and facial recognition are now the norm. Inspired by these advancements, I wanted to experiment with building my own facial recognition software or a chatbot that converts text to speech. As I did my research, I realized that building a web application with these features is relatively straightforward given the powerful tools available from cloud providers like AWS and Google. It is like baking a cake using a prepackaged cake mix.
Below is my recipe for adding facial recognition and text-to-speech functionality to my web application using some of these services. For my experiment, I used two AWS services, Amazon Rekognition and Amazon Polly, to build a login system. Amazon Rekognition is a service that makes it easy to add image analysis to applications, including the ability to find similar faces in a large collection of images. Amazon Polly is a text-to-speech service that synthesizes human-sounding speech from text. The system authenticates the user based on a successful face match and plays a personalized greeting. Other cloud providers, such as Google Cloud and Microsoft Azure, offer similar services. I decided to use AWS, arguably the most widely used of them, for my application, which is built with Node.js and React.
To use the AWS APIs, the very first step is to install the AWS Software Development Kit (SDK). I did that by running the following command:
npm install aws-sdk
Once the package is installed, the next step is to set up the credentials for talking to the services. This requires signing up for an AWS account and creating an access key in IAM (Identity and Access Management). I inject this access key from my application’s config file in the first few lines of the code snippet.
With the credentials in place, we need to create instances of the services we are using. In the code snippet below, I initialize three services: S3 for storing the images and the personalized greetings, Rekognition for matching a given face against the face collection, and Polly, the text-to-speech utility that creates the personalized greeting for each user.
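The setup described above can be sketched roughly as follows. This is a minimal illustration rather than the application's actual code: it assumes version 2 of the SDK (the aws-sdk package installed earlier), and the function name createAwsClients and the shape of the config object are hypothetical. The SDK module is passed in as an argument so the function can be exercised without real credentials.

```javascript
// Sketch: configure credentials once, then create one client per service.
// `AWS` is the module returned by require('aws-sdk') (SDK v2).
// `config` is a hypothetical application config object holding the IAM access key.
function createAwsClients(AWS, config) {
  // Apply the access key and region globally to every client created afterwards.
  AWS.config.update({
    accessKeyId: config.accessKeyId,
    secretAccessKey: config.secretAccessKey,
    region: config.region,
  });

  return {
    s3: new AWS.S3(),                   // stores images and greeting audio
    rekognition: new AWS.Rekognition(), // face collection and face search
    polly: new AWS.Polly(),             // text-to-speech for the greeting
  };
}
```

In the application this might be called once at startup, for example as createAwsClients(require('aws-sdk'), config), with the resulting clients shared by the registration and login code.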
Once the setup is done, we need to create a face collection with the known faces. This is a two-step process. First, we need to store the image in an S3 bucket. For my example, the image was captured by the webcam in PNG format (using the library react-webcam), and I uploaded it to an S3 bucket using the code snippet below. The “image” is the serialized binary format from the webcam, which is passed as a parameter to the AWS “S3.putObject” method. This method persists the image in the S3 bucket whose name is provided. The image name and the bucket name are then used in the second step to add the image to the face collection in the method “uploadImageToFaceCollection”. This two-step setup is done when the user registers to use the web application.
Below is the code snippet that uploads the webcam image to the S3 bucket. The imageKey is used in the next step to add this image to the face collection.
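A minimal sketch of this upload step might look like the following. It assumes an S3 client from version 2 of the SDK, and the function names dataUrlToBuffer and uploadImageToS3 are hypothetical. It also assumes the webcam frame reaches the server as a base64 data URL, which is how react-webcam returns screenshots; if your image is already a Buffer, the decoding helper is not needed.

```javascript
// Assumption: the webcam frame arrives as a base64 data URL such as
// "data:image/png;base64,....". Strip the prefix and decode the rest.
function dataUrlToBuffer(dataUrl) {
  // If there is no comma, indexOf returns -1 and the whole string is decoded.
  const base64 = dataUrl.slice(dataUrl.indexOf(',') + 1);
  return Buffer.from(base64, 'base64');
}

// Upload the image bytes to S3 and resolve with the key it was stored under.
// `s3` is an AWS.S3 client; `bucket` and `imageKey` are chosen by the application.
function uploadImageToS3(s3, bucket, imageKey, image) {
  const params = {
    Bucket: bucket,           // target bucket name
    Key: imageKey,            // object name, reused when indexing the face
    Body: image,              // Buffer holding the PNG bytes
    ContentType: 'image/png',
  };
  return s3.putObject(params).promise().then(() => imageKey);
}
```

Returning the imageKey from the promise keeps the second registration step simple, since that step only needs the bucket and the key.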
In the next step, the image name from the previous step is used to add that image to the face collection using the AWS “Rekognition.indexFaces” method. Creating the face collection is a one-time step and is done using AWS “Rekognition.createCollection”. The collection id provided when the collection was created is sent as an argument to the “Rekognition.indexFaces” method, along with the S3 bucket and image name. A FaceId is returned when the face is successfully added to the collection. This FaceId is stored with the rest of the user information in the database and is used to identify the user when they log in.
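The two Rekognition calls described above can be sketched as follows. This is an illustration under a few assumptions: the client comes from version 2 of the SDK, the function name createFaceCollection is hypothetical, and a registration photo is assumed to contain a single face, so only the first returned face record is used.

```javascript
// One-time step: create the face collection that will hold all registered users.
function createFaceCollection(rekognition, collectionId) {
  return rekognition.createCollection({ CollectionId: collectionId }).promise();
}

// Index the face found in an image already stored in S3 and resolve with its FaceId.
function uploadImageToFaceCollection(rekognition, collectionId, bucket, imageKey) {
  const params = {
    CollectionId: collectionId,
    // Point Rekognition at the object uploaded in the previous step.
    Image: { S3Object: { Bucket: bucket, Name: imageKey } },
  };
  return rekognition.indexFaces(params).promise().then((data) => {
    const records = data.FaceRecords || [];
    if (records.length === 0) {
      // Nothing was indexed, so there is no FaceId to store for this user.
      throw new Error('No face detected in ' + imageKey);
    }
    // Assumption: one face per registration photo, so take the first record.
    return records[0].Face.FaceId;
  });
}
```

The resolved FaceId is the value to save alongside the user's record in the database.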
So far, we have created a face collection for all the registered users of the web application. In this step, we will see how a user can log in to the application using facial recognition. When a user logs in, the image captured by the webcam is searched against the face collection of all the registered users. The input parameter “image” in the following code snippet is the serialized binary format from the webcam. This image is searched for in the face collection using the AWS “Rekognition.searchFacesByImage” method. The method requires the id of the collection to search and the similarity threshold for a match. The FaceId of the matched face is returned upon successful completion. This FaceId is then matched against the database records to identify the user and log them in.
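The login search can be sketched along these lines. As before, this is an illustration and not the application's actual code: the function names findMatchingFaceId and loginWithFace are hypothetical, findUserByFaceId stands in for whatever database lookup the application uses, and the threshold of 90 is only an example value. The image is assumed to be a Buffer of the webcam frame.

```javascript
// Search the collection for the face in `image` and resolve with the FaceId
// of the best match, or null when nothing meets the threshold.
function findMatchingFaceId(rekognition, collectionId, image, threshold = 90) {
  const params = {
    CollectionId: collectionId,
    Image: { Bytes: image },       // raw image bytes instead of an S3 reference
    FaceMatchThreshold: threshold, // minimum similarity (0-100) to count as a match
    MaxFaces: 1,                   // only the closest match is needed for login
  };
  return rekognition.searchFacesByImage(params).promise().then((data) => {
    const matches = data.FaceMatches || [];
    return matches.length > 0 ? matches[0].Face.FaceId : null;
  });
}

// Resolve with the matching user record, or null if the face is not recognized.
// `findUserByFaceId` is a hypothetical database lookup supplied by the application.
function loginWithFace(rekognition, collectionId, image, findUserByFaceId) {
  return findMatchingFaceId(rekognition, collectionId, image).then((faceId) =>
    faceId === null ? null : findUserByFaceId(faceId)
  );
}
```

A caller would treat a null result as a failed login. The underlying call may also reject, for example when the service cannot process the image, so a catch handler on the returned promise is worth adding.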
As we have now seen, in three easy steps we can create a face recognition application, which has many uses, from a login mechanism to searching for faces in family photos. As mentioned previously, I used this feature to build login functionality for my own web application: https://jarvis-stackathon.herokuapp.com/
In my next blog post, I will talk about Amazon Polly, a text-to-speech service that can be used to build personalized greetings for your application or to build your own chatbot. Stay tuned…