Integrating Amazon Rekognition into a Media Asset Management System

Published in firstlineoutsourcing · 5 min read · Dec 6, 2021

AI services such as facial recognition, speech-to-text transcription, and object detection have become popular, and many customers want to add them to their products.

In this article, I’ll explain how to integrate the Amazon Rekognition Service into a third-party Media Asset Management (MAM) System.

Why do MAM Systems need facial recognition services?

First of all, you need to become familiar with MAM Systems. The main idea is that these systems help media companies or teams store and organize their media files (video, audio, images, video editor projects, etc.).

Most media files are video content: streams of computer, poker, or sports games, news reports, films and series, ads, and much more. It is often important to know who appears in a video for search or analysis purposes, so an editor or another specialist has to list these people in the media file’s metadata. Facial recognition systems can automate this process and save their time.

Extension of MAM System

The MAM System provides several ways to extend its functionality.

REST API: allows creating, updating, and deleting all resource entities.

Action: lets a user manually trigger a call from the GUI to a configured server. It can be triggered without context or from a photo, video, or folder. If an Action is triggered from a context, the entity IDs are included in the request payload.

Webhook: calls the configured server in response to an event (for example, an entity is created, updated, or deleted). It also provides the entity IDs in the payload.
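As a rough illustration of how a server behind an Action or Webhook might read the entity IDs from the request, here is a minimal TypeScript sketch. The payload shape (`event` and `entityIds`) and the function name `parseEntityIds` are assumptions made for this example; the real field names depend on the MAM System in use.

```typescript
// Hypothetical payload shape; the real field names depend on the MAM System.
interface MamPayload {
  event?: string; // e.g. an event name for a Webhook; absent for an Action
  entityIds?: string[]; // IDs of the photos, videos, or folders in context
}

// Parses the raw request body and returns the entity IDs.
// Returns an empty list when the Action was triggered without context.
function parseEntityIds(rawBody: string | null): string[] {
  if (!rawBody) {
    return [];
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    throw new Error("Request body is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Request body must be a JSON object");
  }
  const ids = (parsed as MamPayload).entityIds;
  if (ids === undefined) {
    return [];
  }
  if (!Array.isArray(ids) || ids.some((id) => typeof id !== "string")) {
    throw new Error("entityIds must be an array of strings");
  }
  return ids;
}
```

Validating the payload at the edge keeps the rest of the handler free of defensive checks, which matters because the same parsing would be shared by every Action and Webhook endpoint.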

Technologies

You should be familiar with cloud computing and serverless architecture.

Amazon Web Services: a vendor that provides on-demand cloud computing platforms.
Amazon Rekognition: an AI service that offers pre-trained and customizable computer vision (CV) capabilities for image and video analysis.
Lambda Functions and API Gateway: a good match for integration with the MAM System’s Actions and Webhooks.
Simple Storage Service (S3): connected to the MAM System’s storage and used for storing photos of faces and videos for facial recognition.
DynamoDB: a NoSQL database for storing information about analyzed photos of faces.
Serverless Framework: a powerful and flexible framework that deploys infrastructure and code with one command. It supports vendors such as Amazon Web Services, IBM Cloud, Google Cloud Platform, Microsoft Azure, and many more.
Node.js: the JavaScript runtime for running code in Lambda functions.
TypeScript: a typed superset of JavaScript, used for the code in Lambda functions.
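To show how these pieces might fit together, here is a sketch of a Serverless Framework configuration written as a TypeScript object (the framework accepts a `serverless.ts` file). The service name, region, runtime version, handler paths, and URL paths are all assumptions for illustration and do not come from the original project.

```typescript
// Sketch of a serverless.ts configuration. All names, paths, the region, and
// the runtime version are hypothetical; in a real project this object would
// be exported from serverless.ts.
const serverlessConfiguration = {
  service: "mam-rekognition",
  provider: {
    name: "aws",
    runtime: "nodejs20.x",
    region: "eu-west-1", // should match the region of the S3 buckets
    iam: {
      role: {
        statements: [
          {
            Effect: "Allow",
            Action: [
              "rekognition:CreateCollection",
              "rekognition:IndexFaces",
              "rekognition:StartFaceSearch",
              "rekognition:GetFaceSearch",
              "rekognition:DeleteFaces",
              "s3:GetObject",
              "dynamodb:PutItem",
              "dynamodb:Query",
              "dynamodb:DeleteItem",
            ],
            Resource: "*", // narrow this to specific ARNs in practice
          },
        ],
      },
    },
  },
  functions: {
    // One Lambda function per Action or Webhook, each behind API Gateway.
    faceAnalysis: {
      handler: "src/handlers/faceAnalysis.handler",
      events: [{ http: { method: "post", path: "actions/face-analysis" } }],
    },
    facialRecognition: {
      handler: "src/handlers/facialRecognition.handler",
      events: [{ http: { method: "post", path: "actions/facial-recognition" } }],
    },
    removePhoto: {
      handler: "src/handlers/removePhoto.handler",
      events: [{ http: { method: "post", path: "webhooks/photo-removed" } }],
    },
  },
};
```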

Project requirements

Workflow requirements

Users trigger the Create Name Action and enter the name of the person (let’s use John Doe). A folder with that name is then created inside the Faces folder, and users upload photos of the person’s face to it.

/ <- root folder
  Faces <- folder for all faces
    John Doe <- folder of one face
      John Doe 1.jpg <- photo of the face
      John Doe 2.jpg <- photo of the face
      John Doe 3.jpg <- photo of the face

After creating all names and adding photos to them, users trigger the Face Analysis Action. It sends all photos from S3 to the Amazon Rekognition Service for indexing, waits until indexing finishes, and then saves the result to DynamoDB:

{
  faceId, <- Face Id from Amazon Rekognition
  photoId, <- Id of the photo, for example John Doe 1.jpg
  folderId,  <- Id of John Doe folder
}
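The record above could be typed and produced from an indexing result roughly as follows. This is a sketch: `IndexFacesResult` models only the small part of the Rekognition IndexFaces response that is needed here, and the function name `toFaceRecords` is made up for this example.

```typescript
// Record stored in DynamoDB for each indexed face (fields from the article).
interface FaceRecord {
  faceId: string; // Face Id from Amazon Rekognition
  photoId: string; // Id of the photo
  folderId: string; // Id of the folder that represents the person
}

// Minimal subset of the IndexFaces response used in this sketch.
interface IndexFacesResult {
  FaceRecords?: { Face?: { FaceId?: string } }[];
}

// Converts one IndexFaces result into the records to save in DynamoDB.
// Entries without a FaceId are skipped.
function toFaceRecords(
  result: IndexFacesResult,
  photoId: string,
  folderId: string
): FaceRecord[] {
  const records: FaceRecord[] = [];
  for (const entry of result.FaceRecords ?? []) {
    const faceId = entry.Face?.FaceId;
    if (faceId) {
      records.push({ faceId, photoId, folderId });
    }
  }
  return records;
}
```

Storing the folder Id rather than the person's name means a later rename of the folder does not require touching these records.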

Now everything is ready for recognition. Users choose a video and trigger the Facial Recognition Action. The video is sent to the Amazon Rekognition Service. When processing is finished, the service returns a list of matched Face Ids. Each Face Id is resolved to its folder, and therefore to a name, through the records in DynamoDB, and the unique names are added to the video’s metadata.

If users made a mistake in a person’s name, they can choose the folder, trigger the Update Face Name Action, and provide a new name. The Action updates the title of the folder and automatically replaces the name on every video where that person was recognized.

Also, if users accidentally uploaded the wrong photo to a person’s folder, they can remove the photo, and a webhook will be triggered to remove it from the Amazon Rekognition Service and DynamoDB.
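The step from Face Ids to names could look roughly like this. It is a sketch under assumptions: `PersonMatch` models only the fields of a GetFaceSearch result that are used here, the two lookup maps stand in for DynamoDB and MAM System queries, and the default similarity threshold of 90 is an arbitrary example value.

```typescript
// Minimal subset of one entry in the Persons list returned by GetFaceSearch.
interface PersonMatch {
  Timestamp?: number;
  FaceMatches?: { Similarity?: number; Face?: { FaceId?: string } }[];
}

// Resolves matched Face Ids to unique names, in order of first appearance.
// folderIdByFaceId would be built from the DynamoDB records and
// nameByFolderId from the MAM System's folder titles (both hypothetical).
function collectUniqueNames(
  persons: PersonMatch[],
  folderIdByFaceId: Map<string, string>,
  nameByFolderId: Map<string, string>,
  minSimilarity = 90
): string[] {
  const names: string[] = [];
  for (const person of persons) {
    for (const match of person.FaceMatches ?? []) {
      const faceId = match.Face?.FaceId;
      if (!faceId || (match.Similarity ?? 0) < minSimilarity) {
        continue; // ignore weak or incomplete matches
      }
      const folderId = folderIdByFaceId.get(faceId);
      const name = folderId === undefined ? undefined : nameByFolderId.get(folderId);
      if (name !== undefined && names.indexOf(name) === -1) {
        names.push(name);
      }
    }
  }
  return names;
}
```

The same person usually matches many times across a video, once per sampled timestamp and once per indexed photo, which is why the result is deduplicated by name.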
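The metadata part of a rename could be sketched as below. The `VideoMetadata` shape and the function name are assumptions, since the MAM System's real metadata model is not described here. The function returns only the videos that actually change, so the caller knows which ones to update through the REST API.

```typescript
// Hypothetical shape of the recognized-names metadata on a video.
interface VideoMetadata {
  videoId: string;
  names: string[];
}

// Replaces oldName with newName and returns only the videos that changed.
// If newName is already present on a video, the duplicate is dropped.
function renameFaceInVideos(
  videos: VideoMetadata[],
  oldName: string,
  newName: string
): VideoMetadata[] {
  const changed: VideoMetadata[] = [];
  for (const video of videos) {
    if (video.names.indexOf(oldName) === -1) {
      continue; // this video never contained the old name
    }
    const names: string[] = [];
    for (const name of video.names) {
      const next = name === oldName ? newName : name;
      if (names.indexOf(next) === -1) {
        names.push(next);
      }
    }
    changed.push({ videoId: video.videoId, names });
  }
  return changed;
}
```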

Technical requirements

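The major technical requirement is that all photos and videos must be stored in S3 buckets in the same region where the Serverless service is deployed and the Amazon Rekognition Service is used, because Amazon Rekognition can only read S3 objects from a bucket in the region of the endpoint being called.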

Architecture

Create Name

Face Analysis

First of all, we need to create a collection in the Amazon Rekognition Service using the CreateCollection operation. Then we can add faces to it with the IndexFaces operation.
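The request parameters for these two operations might be built as in the sketch below. The interfaces mirror a subset of the CreateCollection and IndexFaces request fields; the helper names and the choice of `MaxFaces: 1` are assumptions for this example. With the AWS SDK for JavaScript v3, the resulting objects would be passed to `CreateCollectionCommand` and `IndexFacesCommand` from `@aws-sdk/client-rekognition`.

```typescript
// Subsets of the Rekognition request shapes used in this sketch.
interface CreateCollectionInput {
  CollectionId: string;
}

interface IndexFacesInput {
  CollectionId: string;
  Image: { S3Object: { Bucket: string; Name: string } };
  ExternalImageId: string;
  MaxFaces: number;
}

// ExternalImageId only accepts letters, digits, and the characters _ . - :
// (up to 255 characters), so a photo Id such as "John Doe 1.jpg" is sanitized.
function toExternalImageId(photoId: string): string {
  return photoId.replace(/[^a-zA-Z0-9_.\-:]/g, "_").slice(0, 255);
}

function buildCreateCollectionInput(collectionId: string): CreateCollectionInput {
  return { CollectionId: collectionId };
}

// Builds the IndexFaces request for one reference photo stored in S3.
function buildIndexFacesInput(
  collectionId: string,
  bucket: string,
  key: string,
  photoId: string
): IndexFacesInput {
  return {
    CollectionId: collectionId,
    Image: { S3Object: { Bucket: bucket, Name: key } },
    ExternalImageId: toExternalImageId(photoId),
    MaxFaces: 1, // assumption: each reference photo shows a single person
  };
}
```

Note that CreateCollection fails with a ResourceAlreadyExistsException if the collection already exists, so a repeated Face Analysis run would need to tolerate that error.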

Facial Recognition

To start the recognition process, we use the StartFaceSearch operation, which returns a job Id. We can then check the status and fetch the result using the GetFaceSearch operation.
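A simple way to wait for the result is to poll until the job leaves the IN_PROGRESS state and then follow the pagination tokens. The sketch below assumes this polling approach; the `getPage` callback is injected so the logic stays independent of the SDK, and the delay and attempt limits are arbitrary example values. With the AWS SDK for JavaScript v3, `getPage` would wrap a `GetFaceSearchCommand` call with the job Id returned by StartFaceSearch.

```typescript
// Subset of the StartFaceSearch request used in this sketch.
interface StartFaceSearchInput {
  CollectionId: string;
  Video: { S3Object: { Bucket: string; Name: string } };
  FaceMatchThreshold?: number;
}

function buildStartFaceSearchInput(
  collectionId: string,
  bucket: string,
  key: string
): StartFaceSearchInput {
  return { CollectionId: collectionId, Video: { S3Object: { Bucket: bucket, Name: key } } };
}

// Subset of one GetFaceSearch response page.
interface FaceSearchPage<T> {
  JobStatus?: "IN_PROGRESS" | "SUCCEEDED" | "FAILED";
  StatusMessage?: string;
  Persons?: T[];
  NextToken?: string;
}

// Polls until the job finishes, then collects the Persons from all pages.
async function waitForFaceSearch<T>(
  getPage: (nextToken?: string) => Promise<FaceSearchPage<T>>,
  delayMs = 5000,
  maxAttempts = 60
): Promise<T[]> {
  let page = await getPage();
  let attempts = 1;
  while (page.JobStatus === "IN_PROGRESS") {
    if (attempts >= maxAttempts) {
      throw new Error("Face search did not finish in time");
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    page = await getPage();
    attempts++;
  }
  if (page.JobStatus !== "SUCCEEDED") {
    throw new Error("Face search failed: " + (page.StatusMessage ?? "unknown reason"));
  }
  const persons: T[] = [...(page.Persons ?? [])];
  while (page.NextToken) {
    page = await getPage(page.NextToken);
    persons.push(...(page.Persons ?? []));
  }
  return persons;
}
```

Polling inside a Lambda function is bounded by the function timeout, so for long videos it may be preferable to pass a `NotificationChannel` to StartFaceSearch and react to the completion message published to Amazon SNS instead.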

Update Face Name

Remove Photo

To delete a face from the Amazon Rekognition collection, we use the DeleteFaces operation.
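When the webhook reports a removed photo, the handler has to find the Face Ids that were indexed from it and remove them in both places. The sketch below shows one way to plan that cleanup; the function name and return shape are assumptions, and `records` stands in for the result of a DynamoDB lookup. The `DeleteFacesInput` object would be passed to `DeleteFacesCommand` from `@aws-sdk/client-rekognition`.

```typescript
// Record stored in DynamoDB for each indexed face (fields from the article).
interface FaceRecord {
  faceId: string;
  photoId: string;
  folderId: string;
}

// Subset of the DeleteFaces request used in this sketch.
interface DeleteFacesInput {
  CollectionId: string;
  FaceIds: string[];
}

interface PhotoRemovalPlan {
  deleteFacesInput: DeleteFacesInput | null; // null when nothing was indexed
  recordsToDelete: FaceRecord[];
}

// Works out what to delete from Rekognition and DynamoDB for a removed photo.
function planPhotoRemoval(
  collectionId: string,
  records: FaceRecord[],
  removedPhotoId: string
): PhotoRemovalPlan {
  const recordsToDelete = records.filter((r) => r.photoId === removedPhotoId);
  if (recordsToDelete.length === 0) {
    // DeleteFaces requires at least one Face Id, so skip the call entirely.
    return { deleteFacesInput: null, recordsToDelete };
  }
  return {
    deleteFacesInput: {
      CollectionId: collectionId,
      FaceIds: recordsToDelete.map((r) => r.faceId),
    },
    recordsToDelete,
  };
}
```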

Conclusion

You have now learned what MAM Systems are, why they need AI services, and how to integrate the Amazon Rekognition Service into a third-party Media Asset Management System. You can adapt this architecture to your own application.

Feel free to contact me through email, Instagram, LinkedIn, or Facebook.