A Step by Step Guide

Building a Simple AI Photo Analytics App on AWS

Using Amazon’s Rekognition, Lambda & QuickSight for near real-time Image Analytics

Andrew Gooday
The Startup

--

Image Analytics via Amazon QuickSight

Overview

We’re going to use some clever AWS features to build an AI-powered image analytics app, providing machine-learning-derived insights on photos in near real time.

What’s more, we’re going to use ‘serverless’ components, meaning you should be able to get this up and running on your images in a couple of hours or less!

Here’s what you’re aiming for: a dashboard that refreshes with some nice infographics to show you what objects and text are in your photos, as soon as you upload them.

Getting Started

First things first, you’ll need an AWS account, ideally with full admin access.
Please use the SAME REGION for everything (S3, QuickSight, Athena, Lambda, Glue), and make sure QuickSight is available in the region you’ve chosen.

There will be some small charges incurred — check AWS pricing details for your region. If you’re on the 1 year free tier, then unless you do something very wrong (in particular in the “Lambda” section below), you shouldn’t spend more than a couple of dollars maximum. (I didn’t)

What you will notice as we progress is that we’re:

  • NOT building, configuring, or managing any servers, including any databases.
    That’s fine, because this is all going to be ‘serverless’.
  • NOT building, tuning, or training any neural nets.
    Instead, we’ll be using AWS Rekognition’s built-in capabilities.

How it Works

The user (you!) puts 20–30 images into a specific S3 bucket (1.) This triggers an event that launches a Lambda function, for which I supply a bit of Python code (2.) This is code running in the cloud without you having to think about servers etc., so pretty useful. In fact, 20–30 Lambda functions will fire up, one to handle each image; each lasts about 1 second, then disappears.

AI Photo Analytics App

The Lambda Python code calls Amazon Rekognition, which runs two algorithms, object detection and text detection, on the image (3.) Because the image is sitting in S3, you don’t have to worry about pre-processing it. You can extend the Python code later to add in more Rekognition calls (face detection etc.) if you want.

The Lambda function stores the resulting label and text descriptions (4), one file each, into their corresponding ‘folders’ in S3. These output files are in JSON format, and ‘flattened’ so that you get some useful columns for analysis later. (Rekognition’s JSON is deeply nested — for example the ‘person’ label may have lots of ‘bounding boxes’ returned if there are numerous people in the image.)
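
To make that concrete, here’s roughly what the flattening does: one nested detect_labels response becomes one flat record per label. This is just a sketch to illustrate the idea; the field names are example choices rather than the exact ones the Lambda script uses.

    # Illustrative only: turn a nested Rekognition detect_labels response
    # into one flat record per label. Field names are example choices.
    def flatten_labels(image_name, response):
        rows = []
        for label in response.get("Labels", []):
            rows.append({
                "image": image_name,
                "label": label["Name"],
                "confidence": round(label["Confidence"], 2),
                # Collapse nested parent categories into a simple string
                "parents": ",".join(p["Name"] for p in label.get("Parents", [])),
                # Count bounding boxes rather than keeping the nested Instances list
                "instances": len(label.get("Instances", [])),
            })
        return rows

    # A heavily trimmed nested response and its flattened form
    sample = {"Labels": [{"Name": "Person", "Confidence": 99.1,
                          "Parents": [], "Instances": [{}, {}]}]}
    print(flatten_labels("holiday.jpg", sample))
    # -> [{'image': 'holiday.jpg', 'label': 'Person', 'confidence': 99.1,
    #      'parents': '', 'instances': 2}]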

So, here’s the clever bit, step (5): we’ll manually run an AWS Glue crawler, which will create the data catalogue and metadata needed so that both Athena and QuickSight can run analysis directly on the JSON files (from step 4) as if you had a database.

You only need to run the crawler once; based on the data generated from your first upload, there’s enough info there for the “labels” and “text” tables to be constructed automatically. Again, there’s no database behind this analytics app, just your JSON files sitting in S3.

As we’ve got two tables, if you want to use data from both text and object detection, you’ll need to join them to create a ‘view’. This can be done easily in Athena with a simple command (6).

Finally, step (7), we’ll connect QuickSight to the “database” and “view” created in the previous step.

Setting Up Your S3 Buckets

Step 1

If you’re reading this article, you’ll hopefully be at least a little familiar with S3. S3 is serverless, infinitely scalable storage, where all of your unstructured data can sit, floating around in a fully resilient and secure data lake.

But for this tutorial, you’ll just be creating 2 buckets.

  1. create a new empty bucket to store images in. I’d rather you had a specific bucket, used for no other purpose than storing images, as it will make it harder to get things wrong later. I’ve called mine “agooday-images-eu”.
  2. create a new empty bucket for storing the label and text detection files, and give each a separate ‘folder’ (they’re not really folders, but it’s easier to talk about them as such.) I’ve called mine “agooday-athena-test/labels” and “agooday-athena-test/text”. ** please do not put your output files in the same bucket as your source images ** If you’d rather script this setup, see the sketch just after this list.
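
Here’s a boto3 sketch of the same bucket setup; the bucket names and region are only examples, so substitute your own.

    import boto3

    # Example names only; S3 bucket names must be globally unique.
    REGION = "eu-west-1"
    s3 = boto3.client("s3", region_name=REGION)

    for bucket in ["agooday-images-eu", "agooday-athena-test"]:
        s3.create_bucket(
            Bucket=bucket,
            # For us-east-1, omit CreateBucketConfiguration entirely
            CreateBucketConfiguration={"LocationConstraint": REGION},
        )

    # The 'folders' are just key prefixes; zero-byte placeholder objects
    # make them visible in the console straight away.
    for prefix in ["labels/", "text/"]:
        s3.put_object(Bucket="agooday-athena-test", Key=prefix)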

Configuring Lambda & Rekognition

Steps 2, 3 & 4

OK, let’s configure the Lambda function. Here’s the Python code:

The only thing you need to do before we use this code is to update line 6 with your target bucket name (the name of the bucket where the labels/ and text/ JSON will be stored).

Boto3 is AWS’s Python SDK, allowing you to control any AWS service. We’re using it to read/write files to S3 and to access Rekognition. Rekognition returns its JSON response as a Python dictionary, and the script saves this to the target bucket in a format that AWS Glue likes.
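
As a rough sketch of what such a handler can look like (this is illustrative, not the exact script referenced above: the target bucket, output key names, and choice of flattened fields are assumptions you’d adapt), something along these lines does the job, writing newline-delimited JSON, which Glue handles well:

    import json
    import urllib.parse
    import boto3

    TARGET_BUCKET = "agooday-athena-test"  # <-- set this to your target bucket

    s3 = boto3.client("s3")
    rekognition = boto3.client("rekognition")

    def lambda_handler(event, context):
        # The S3 put event tells us which bucket/key triggered this invocation
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["object"]["key"])
        image = {"S3Object": {"Bucket": bucket, "Name": key}}

        # Object detection: one flattened JSON line per label
        labels = rekognition.detect_labels(Image=image, MaxLabels=50)["Labels"]
        label_rows = [{"image": key,
                       "label": l["Name"],
                       "confidence": l["Confidence"],
                       "parents": ",".join(p["Name"] for p in l.get("Parents", []))}
                      for l in labels]
        if label_rows:
            s3.put_object(Bucket=TARGET_BUCKET,
                          Key=f"labels/{key}.labels.json",
                          Body="\n".join(json.dumps(r) for r in label_rows))

        # Text detection: same idea, only written if any text was found
        detections = rekognition.detect_text(Image=image)["TextDetections"]
        text_rows = [{"image": key,
                      "text": t["DetectedText"],
                      "type": t["Type"],
                      "confidence": t["Confidence"]}
                     for t in detections if t["Type"] == "LINE"]
        if text_rows:
            s3.put_object(Bucket=TARGET_BUCKET,
                          Key=f"text/{key}.text.json",
                          Body="\n".join(json.dumps(r) for r in text_rows))

        print(f"{key}: {len(label_rows)} labels, {len(text_rows)} text lines")
        return {"statusCode": 200}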

Log in to the AWS console and create a new Lambda function. Select “Author from scratch”, “Python 3.7” and “Create a new role with basic Lambda permissions”.

Create a Lambda Function — Author from Scratch

Now, go to permissions and click on the role that’s just been automatically created, in this example ‘s3lambda-role-4ywsmx9h’…

Lambda Permissions — Click on the Link

You should now be at the IAM console, where you’ll assign further permissions. Click “Attach Permissions” and search for each of the policies you’re missing from below. Click the checkbox and add (if you click on the policy name itself, you’ll just head off looking at the policy details.)

Updating Lambda Role Permissions

You’ll need the CloudWatch policy to see what’s happening and troubleshoot, S3 to read/write to your buckets, and Rekognition, well, to recognise things!
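
In the console this is a few clicks, but if you prefer to script it, attaching AWS managed policies to the auto-generated role looks roughly like this. The broad policies below are an example choice to keep the walkthrough simple; tighten them for anything real.

    import boto3

    iam = boto3.client("iam")
    role_name = "s3lambda-role-4ywsmx9h"  # the role Lambda created for you

    # Broad managed policies keep the walkthrough simple; scope them down later.
    for policy_arn in [
        "arn:aws:iam::aws:policy/CloudWatchLogsFullAccess",
        "arn:aws:iam::aws:policy/AmazonS3FullAccess",
        "arn:aws:iam::aws:policy/AmazonRekognitionReadOnlyAccess",
    ]:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)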

Now we’re ready to paste the Python code into Lambda.

Pasting the Python Code into Lambda

You’ll see I’ve updated the target bucket in line 6. Once that’s done, save and select + Add trigger.

Adding S3 Trigger

This dialogue will configure S3 to send an event and launch the Lambda function. Please select the source bucket in which you’ll be putting your images, and only your images, nothing else, unless you want to trigger the Lambda. Disable the trigger for now and click Add. Later, when the trigger is active, you’ll be able to see it in your S3 bucket under Properties, Events, as an active notification.
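
The console dialogue does this for you, but under the hood it’s just an S3 bucket notification pointing at your function. A boto3 equivalent looks something like this, where the bucket name and function ARN are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # Placeholders: use your source bucket and your Lambda function's ARN.
    s3.put_bucket_notification_configuration(
        Bucket="agooday-images-eu",
        NotificationConfiguration={
            "LambdaFunctionConfigurations": [{
                "LambdaFunctionArn": "arn:aws:lambda:eu-west-1:123456789012:function:photo-analytics",
                "Events": ["s3:ObjectCreated:*"],
            }]
        },
    )

Note that if you script it this way rather than using the console, S3 also needs permission to invoke the function (via lambda add_permission); the console dialogue handles that part for you.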

You should now be here:

Lambda — Code & Triggers Configured

Time to test this is working. Select Test, navigate to the s3-put template (which, if you get stuck, you can modify to help troubleshoot) and create a test event.

Configuring a Lambda Test Event

Run the test and you should see the following:

Lambda Test Output

Don’t worry about the “Call Failed …” error.

Now enable the trigger, then click through to your S3 bucket. This will open the correct bucket (avoiding any chance of error), and you can upload a file directly.

Upload Photos into S3 via the Console

Now go back into Lambda, Monitoring and scroll down to logs. Click on the most recent log:

Lambda Logs List

And you should see something like this:

Detailed Logs from Lambda

If you don’t see any logs, check your permissions. If you need to troubleshoot further, then you’ll have to update the test put with some valid details. I’ve an example here.

Navigate to your target bucket and labels/, and check for a *.labels.json file.
There may also be a *.text.json file in your text/ directory; it depends on whether text was detected. Check the CloudWatch Lambda logs if in doubt, as they will report whether labels or text were found.

Well done! You’ve just configured an S3 trigger and Lambda functions to get object and text details, able to scale to handle any number of photos (your finances allowing!) Note you may need to extend your Lambda timeout from the default of 3s if you see Lambda timeouts in the CloudWatch logs.
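
If you do hit timeouts, raising the limit is a single call (or a slider in the Lambda console); the function name and the 30-second value below are just examples:

    import boto3

    boto3.client("lambda").update_function_configuration(
        FunctionName="photo-analytics",  # your function's name
        Timeout=30,                      # seconds; the default is 3
    )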

Cataloguing Your AI Data with Glue

Step 5

It gets easier from here. First, upload 20–30 photos into your S3 source bucket and check that the corresponding labels and text JSON files are generated in your target bucket. Not every image will have a text JSON file, but most if not all should have a labels file. (The Lambda function won’t write empty JSON if nothing is detected.)
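
You can drag and drop those through the console, or push a local folder up with a few lines of boto3; the local path and bucket names here are examples:

    import boto3
    from pathlib import Path

    s3 = boto3.client("s3")
    source_bucket = "agooday-images-eu"   # your image bucket

    # Upload every .jpg in a local folder; each upload fires the Lambda trigger.
    for photo in Path("~/photos").expanduser().glob("*.jpg"):
        s3.upload_file(str(photo), source_bucket, photo.name)

    # A quick sanity check that the flattened output is appearing.
    target = s3.list_objects_v2(Bucket="agooday-athena-test", Prefix="labels/")
    print(f"{target.get('KeyCount', 0)} label files so far")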

You’ve done this so that the crawler can build a table with a good number of columns in it. Some images will have lots of labels, and nested parent categories.

Navigate to Glue and “Add Crawler”. I’ve expanded the custom classifier tab below to show it’s there, and to point out that I wasted a lot of time here trying to get Glue to work with the JSON as it’s sent by Rekognition. Fortunately, your shiny new Lambda function flattens and formats the JSON, so nothing is needed here other than naming your crawler; ignore the classifiers.

Creating a Crawler — No need for Custom Classifiers!

On the next screen select “Data stores”, then “next”. Now add your target bucket as your data store.

Specify the Data Source for your Analytics

On the next screen, select “no” to add another data store, then “Next”.

Create an IAM role

Setting the Glue Role

Then “Run on Demand”, “Next”

On “Configure the crawler’s output”, select “Add database”, give it the name “image-db”, click “Create”, then “Next”, then “Finish”.

Run the Crawler

It’ll take about 40–50 seconds, and when it’s finished you will have a database “image-db” with two tables, “labels” and “text”, in it.
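
For reference, the same crawler can be defined and started with boto3 rather than the wizard; the crawler name is arbitrary and the role needs Glue plus S3 access, like the one created above.

    import boto3

    glue = boto3.client("glue")

    glue.create_crawler(
        Name="image-crawler",                  # any name you like
        Role="AWSGlueServiceRole-images",      # the role from the wizard (or similar)
        DatabaseName="image-db",
        Targets={"S3Targets": [{"Path": "s3://agooday-athena-test/"}]},
    )
    glue.start_crawler(Name="image-crawler")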

Have a look at the “labels” table. And yes, you can now query this with SQL.

A Table Created from your Image Data, automatically by Glue

Go back to the list of tables, and select Action, “View data”, “Preview data”.

This will take you to Athena, which is where we need to be next…

Creating a View in Athena

Step 6

Nearly there. You should now be looking at a SQL query dialogue:

Querying your Image’s JSON files Directly, via Athena’s SQL

So now we just need to create a view that joins the labels and text tables together.

I’m being selective below, but you could select all columns if you wanted.
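
You can paste the SQL straight into the Athena query editor, or run it with boto3 as below. The view name “myview” matches what we’ll pick in QuickSight later, but the column names are assumptions based on the flattened fields from the Lambda sketch earlier, so adjust them to whatever your crawler actually discovered; the results location must be an S3 path you own.

    import boto3

    athena = boto3.client("athena")

    # Column names assume the flattened fields from the earlier Lambda sketch;
    # adjust to match the schema Glue actually discovered.
    create_view_sql = """
    CREATE OR REPLACE VIEW myview AS
    SELECT l.image,
           l.label,
           l.confidence AS label_confidence,
           t.text,
           t.confidence AS text_confidence
    FROM labels l
    LEFT JOIN "text" t ON l.image = t.image
    """

    athena.start_query_execution(
        QueryString=create_view_sql,
        QueryExecutionContext={"Database": "image-db"},
        ResultConfiguration={"OutputLocation": "s3://agooday-athena-test/athena-results/"},
    )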

Run this SQL, and you should now have a view that looks like this:

Creating a View in Athena

That’s all of the data catalogue work done — every time you upload an image, the AI derived text and image label data will be available immediately in your database.

That’s cool, but it would be even better if we could present it in a dashboard, which is what we’ll do next.

Connecting It All Together — Adding QuickSight

Step 7

Now for the fun stuff — let’s hook up QuickSight and take a look at our image data. Create a QuickSight account, then navigate to “Create a Data Set”.

Ensure you use the database name we created earlier.

Connecting Athena to Image Data

Then select “myview” on the next screen.

Don’t use SPICE, query data directly for now. This ensures that as you add images, you’ll be able to access the image object and text data straight away.

Then click “Visualize”. You now need to check permissions.

Select “Manage QuickSight” from your account icon, top right, security & permissions. Click on Add or Remove and ensure IAM, Amazon S3 and Amazon Athena are checked.

Most importantly, click on Details and ensure your target bucket is checked (the bucket with the JSON files in it).

That’s it — we’re done!

Upload Images & see Rekognition Info Immediately

Take a look at this video — I start with empty S3 image and data buckets, so QuickSight shows ‘no data’. I then drop the images back into the source folder. This triggers Lambda, and the logs show the label and text capture. The target folders are populated and a refresh of QuickSight shows the data present and ready for querying.

What Next?

You could try adding custom labels, to get AWS Rekognition to build on what it can already identify (transfer learning without the hassle), or add face recognition or content moderation. You could use Lambda to send images with high text content to Comprehend for further insight.
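
As a taste of that last idea, handing detected text on to Comprehend is only a couple of extra calls. The image location and language code below are placeholders:

    import boto3

    rekognition = boto3.client("rekognition")
    comprehend = boto3.client("comprehend")

    # Placeholder image location; any photo with readable text in it.
    image = {"S3Object": {"Bucket": "agooday-images-eu", "Name": "receipt.jpg"}}

    lines = [t["DetectedText"]
             for t in rekognition.detect_text(Image=image)["TextDetections"]
             if t["Type"] == "LINE"]

    if lines:
        text = " ".join(lines)[:5000]  # rough truncation to stay within Comprehend's size limits
        entities = comprehend.detect_entities(Text=text, LanguageCode="en")
        sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
        print([e["Text"] for e in entities["Entities"]], sentiment["Sentiment"])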

Keep an eye on your billing data via the billing console and don’t go crazy with your free tier limits, for example Rekognition’s 5,000 images per month.

It’s more work, but you could start working with videos and streams — check out what’s possible here
