One recipe for Content Moderation in videos using AWS

This is a simple recipe for using S3, Rekognition, SNS, SQS and Lambda functions in Node.js.

Published in

Hexacta Engineering

Jul 17, 2020

In this case, we must detect and moderate content in videos that are uploaded to an S3 bucket. So, how do we do it without changing our existing flow (or by changing it only a little)?

We have built this architecture:

Which services are we going to use?

S3: object storage service, that is, cloud storage.
Rekognition: computer vision platform that provides image and video analysis and can recognize many labels. It also offers “content moderation” capabilities.
Lambda: platform for running event-driven functions, better known as serverless.
SNS: message delivery platform. It acts only as a message bus and does not persist messages.
SQS: message queue service. It can be used to decouple services; here we use it to receive the messages from SNS.

Which labels does Rekognition detect?

| Top-Level Category  | Second-Level Category                 |
|---------------------|---------------------------------------|
| Explicit Nudity     |                                       |
|                     | Nudity                                |
|                     | Graphic Male Nudity                   |
|                     | Graphic Female Nudity                 |
|                     | Sexual Activity                       |
|                     | Illustrated Nudity Or Sexual Activity |
|                     | Adult Toys                            |
| Suggestive          |                                       |
|                     | Female Swimwear Or Underwear          |
|                     | Male Swimwear Or Underwear            |
|                     | Partial Nudity                        |
|                     | Revealing Clothes                     |
| Violence            |                                       |
|                     | Graphic Violence Or Gore              |
|                     | Physical Violence                     |
|                     | Weapon Violence                       |
|                     | Weapons                               |
|                     | Self Injury                           |
| Visually Disturbing |                                       |
|                     | Emaciated Bodies                      |
|                     | Corpses                               |
|                     | Hanging                               |
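
If we want to configure which categories to act on, the table can be kept as a plain lookup object. This is only an illustrative sketch: `MODERATION_TAXONOMY` and `topLevelOf` are hypothetical names of our own, not part of the AWS SDK, and the entries are simply copied from the table above.

```javascript
// Hypothetical lookup built from the table above (not an AWS SDK object).
const MODERATION_TAXONOMY = {
  'Explicit Nudity': [
    'Nudity',
    'Graphic Male Nudity',
    'Graphic Female Nudity',
    'Sexual Activity',
    'Illustrated Nudity Or Sexual Activity',
    'Adult Toys',
  ],
  'Suggestive': [
    'Female Swimwear Or Underwear',
    'Male Swimwear Or Underwear',
    'Partial Nudity',
    'Revealing Clothes',
  ],
  'Violence': [
    'Graphic Violence Or Gore',
    'Physical Violence',
    'Weapon Violence',
    'Weapons',
    'Self Injury',
  ],
  'Visually Disturbing': ['Emaciated Bodies', 'Corpses', 'Hanging'],
};

// Returns the top-level category for a label name, or null if the name
// is not in the table. A top-level name maps to itself.
function topLevelOf(labelName) {
  for (const [parent, children] of Object.entries(MODERATION_TAXONOMY)) {
    if (labelName === parent || children.includes(labelName)) {
      return parent;
    }
  }
  return null;
}
```

Rekognition also returns a `ParentName` field with every moderation label, so a lookup like this is mostly useful for configuration, for example to list the categories we want to block.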

The steps are:

1. Create an SNS topic to receive a notification from Rekognition when it finishes its job.
2. Create an SQS queue to receive the notifications that the SNS topic outputs. We need the queue because SNS does not persist messages, so there is no other way to read them afterwards; once they are in the queue, you can do whatever you want with them.
3. Create the “StartLabelDetection” Lambda. It will invoke StartContentModeration from Rekognition:
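
The original listing is not reproduced here, so what follows is a minimal sketch of what such a handler could look like, assuming the AWS SDK for JavaScript v2 (`aws-sdk`) is available in the Lambda runtime. The environment variable names `SNS_TOPIC_ARN` and `REKOGNITION_ROLE_ARN` and the helpers `decodeS3Key` and `buildModerationParams` are our own assumptions for illustration.

```javascript
// S3 URL-encodes object keys in event notifications and uses "+" for spaces.
function decodeS3Key(rawKey) {
  return decodeURIComponent(rawKey.replace(/\+/g, ' '));
}

// Builds the request for Rekognition's StartContentModeration operation.
function buildModerationParams(bucket, key, topicArn, roleArn, minConfidence = 50) {
  return {
    Video: { S3Object: { Bucket: bucket, Name: key } },
    MinConfidence: minConfidence,
    // Rekognition publishes the job status to this topic using this role.
    NotificationChannel: { SNSTopicArn: topicArn, RoleArn: roleArn },
  };
}

// Lambda entry point, triggered by the S3 upload event.
async function handler(event) {
  // Assumption: SDK v2; loaded lazily so the helpers above have no dependencies.
  const AWS = require('aws-sdk');
  const rekognition = new AWS.Rekognition();

  const record = event.Records[0];
  const bucket = record.s3.bucket.name;
  const key = decodeS3Key(record.s3.object.key);

  const params = buildModerationParams(
    bucket,
    key,
    process.env.SNS_TOPIC_ARN,        // hypothetical variable name
    process.env.REKOGNITION_ROLE_ARN  // hypothetical variable name
  );
  const { JobId } = await rekognition.startContentModeration(params).promise();
  console.log(`Started moderation job ${JobId} for s3://${bucket}/${key}`);
  return JobId;
}

exports.handler = handler;
```

The call returns right away with a `JobId`; the analysis itself runs asynchronously.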

In the example, we set “MinConfidence” to 50, but in production we found that only results with a confidence above 80 are accurate enough.
We added a trigger for this Lambda: “every time a video is uploaded to the path /video/ in S3, execute it”.
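
The trigger can be added from the console. Expressed as code, it might look roughly like the sketch below, which only builds the parameters for S3's `putBucketNotificationConfiguration` call. The helper name `buildUploadTrigger` is hypothetical, and we assume the prefix is written as `video/`, since S3 keys normally have no leading slash.

```javascript
// Builds the parameters for S3's putBucketNotificationConfiguration call.
// Assumption: uploads land under the "video/" prefix of the bucket.
function buildUploadTrigger(bucket, lambdaArn, prefix = 'video/') {
  return {
    Bucket: bucket,
    NotificationConfiguration: {
      LambdaFunctionConfigurations: [
        {
          LambdaFunctionArn: lambdaArn,
          // Fire for every kind of object creation (put, copy, multipart upload).
          Events: ['s3:ObjectCreated:*'],
          Filter: {
            Key: { FilterRules: [{ Name: 'prefix', Value: prefix }] },
          },
        },
      ],
    },
  };
}
```

With SDK v2 the result could be passed to `new AWS.S3().putBucketNotificationConfiguration(params).promise()`. Keep in mind that this call replaces the bucket's whole notification configuration, and that the Lambda also needs a permission that lets S3 invoke it.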

Rekognition’s “startContentModeration” will notify SNS when it finishes the inference.

4. Create the “GetLabels” Lambda. It will receive an event from SNS when Rekognition has finished the inference.
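
The notification arrives as a JSON string inside the SNS record. A minimal sketch of how the Lambda might unpack it is shown below; `parseJobNotifications` and `succeededJobs` are hypothetical helper names, and we assume the Lambda is subscribed directly to the topic, so each record carries an `Sns.Message` field.

```javascript
// Extracts Rekognition job notifications from a Lambda SNS event.
// Each SNS record carries the notification as a JSON string in Sns.Message.
function parseJobNotifications(event) {
  return event.Records.map((record) => {
    const message = JSON.parse(record.Sns.Message);
    return {
      jobId: message.JobId,
      status: message.Status, // "SUCCEEDED" or "FAILED"
      api: message.API,       // e.g. "StartContentModeration"
      bucket: message.Video.S3Bucket,
      key: message.Video.S3ObjectName,
    };
  });
}

// Keeps only the jobs whose results can actually be fetched.
function succeededJobs(event) {
  return parseJobNotifications(event).filter((job) => job.status === 'SUCCEEDED');
}
```

If the Lambda reads from the SQS queue instead, the SNS envelope would sit inside each record's `body`, so the parsing would need one more `JSON.parse` step.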

We are saving the labels to an S3 bucket, but we could instead update a column in a database, using the jobId as the key.
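
Fetching and saving could be sketched as follows. Results from `getContentModeration` are paginated, so the sketch follows `NextToken` until it runs out. The function names and the `<video key>.labels.json` naming scheme are assumptions of ours, and `rekognition` is expected to behave like the SDK v2 client, that is, to return an object with a `.promise()` method.

```javascript
// Collects every page of results for a finished moderation job.
// `rekognition` is any object exposing getContentModeration(params).promise(),
// as the AWS SDK v2 Rekognition client does.
async function fetchAllModerationLabels(rekognition, jobId) {
  const labels = [];
  let nextToken;
  do {
    const params = { JobId: jobId, SortBy: 'TIMESTAMP' };
    if (nextToken) {
      params.NextToken = nextToken;
    }
    const page = await rekognition.getContentModeration(params).promise();
    labels.push(...page.ModerationLabels);
    nextToken = page.NextToken;
  } while (nextToken);
  return labels;
}

// Builds the S3 putObject parameters for the labels file.
// Assumption: the file is stored next to the video as "<video key>.labels.json".
function buildLabelsObject(bucket, videoKey, labels) {
  return {
    Bucket: bucket,
    Key: `${videoKey}.labels.json`,
    Body: JSON.stringify(labels, null, 2),
    ContentType: 'application/json',
  };
}
```

Inside the Lambda, the two pieces could be combined as `await s3.putObject(buildLabelsObject(bucket, key, labels)).promise()`, where `s3` is an `AWS.S3` client.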

5. Between these steps, we created some roles and policies to allow communication between the services.
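
The exact roles and policies are not listed here, so the following is only a sketch of the three policy documents that such a setup typically needs, written as plain objects. The function names are hypothetical, and the ARNs are supplied by the caller.

```javascript
// Trust policy that lets Rekognition assume the role passed
// in NotificationChannel.RoleArn.
function rekognitionTrustPolicy() {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Principal: { Service: 'rekognition.amazonaws.com' },
        Action: 'sts:AssumeRole',
      },
    ],
  };
}

// Permission for that role to publish job notifications to the topic.
function publishToTopicPolicy(topicArn) {
  return {
    Version: '2012-10-17',
    Statement: [{ Effect: 'Allow', Action: 'sns:Publish', Resource: topicArn }],
  };
}

// Queue policy that lets the topic (and only that topic) deliver to the queue.
function queueAccessPolicy(queueArn, topicArn) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Principal: { Service: 'sns.amazonaws.com' },
        Action: 'sqs:SendMessage',
        Resource: queueArn,
        Condition: { ArnEquals: { 'aws:SourceArn': topicArn } },
      },
    ],
  };
}
```

On top of these, the Lambda execution roles would need permission to call the Rekognition operations, to pass the role above to Rekognition, and to read and write the relevant S3 objects.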

A small sample of the response from this approach:

Next steps…

In the following iteration, instead of uploading labels.json to an S3 bucket, we could save the information in a database, for example by setting a suspicious boolean flag along with the moderation labels.
Develop a CloudFormation script to automate all the creation steps, including the IAM roles and policies, which are the awkward part.
Test which confidence value best suits our needs.
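
As a starting point for the first and last items, the flag could be derived from the saved labels with a configurable threshold. This is a sketch under our own assumptions: `summarizeModeration` is a hypothetical helper, the default of 80 comes from the production observation mentioned earlier, and the input is expected in the shape of the `ModerationLabels` array returned by `getContentModeration`.

```javascript
// Reduces the per-timestamp detections of one video to a flag plus the
// distinct label names whose confidence is at or above the threshold.
function summarizeModeration(moderationLabels, minConfidence = 80) {
  const names = new Set();
  for (const { ModerationLabel } of moderationLabels) {
    if (ModerationLabel.Confidence >= minConfidence) {
      names.add(ModerationLabel.Name);
    }
  }
  return { suspicious: names.size > 0, labels: [...names].sort() };
}
```

Running the same saved labels through several thresholds is a cheap way to compare confidence values without calling Rekognition again.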
