AWS Rekognition for Image Labeling

Russell W. Myers
6 min read · Mar 16, 2020


The Rekognition API is a fully managed Amazon Web Services (AWS) service designed for computer vision applications. As part of AWS’s Machine Learning suite, Rekognition provides scalable, on-demand image and video AI processing. Beyond its role in production architectures, it can also help data scientists and machine learning engineers build better models in less time.

Photo by @jontyson on Unsplash

When building convolutional neural networks (CNNs) for image classification, ensuring training images are consistent with the desired label is the simplest way to increase model accuracy. Just as a Japanese character masquerading as the 27th letter would confuse a child learning the English alphabet, it’s easy to see how training a neural network to recognize apples, while showing it the occasional banana, would negatively affect model performance.

AWS Rekognition Features, Source: Amazon

Training CNNs for image recognition requires at least 800 high-quality photos per class. For example, if you’re building a model to distinguish between horses and cattle, plan on gathering about 3,000 images: roughly 20% will prove useless or redundant, and another 30% will comprise the null class (these can be images of literally anything but horses or cattle). Compared with numerical or categorical data, which can be screened for inconsistent types and null values with minimal code, imagery’s inherent complexity creates more preprocessing obstacles:

  1. How do you know all training images are representative of your target label?
  2. Does your training imagery contain multiple instances of the target label, extraneous objects or other noise likely to confuse your model?
  3. Are there redundant images with unique filenames in your directory?

Traditionally, these questions could not be confidently answered without a human manually reviewing each and every image to confirm the advertised “apple” is in fact the fruit and not the technology giant. Weeks had to be budgeted for this time-intensive, repetitive work unless it was hired out via platforms like Amazon Mechanical Turk, and even delegation takes time. As an AWS Certified Solutions Architect Associate & Cloud Practitioner, I prefer to work smarter by leveraging AWS technology wherever possible. So in lieu of spending the weekend tediously clicking through thousands of photos, I experimented with enlisting AWS Rekognition for the task of verifying image labels — and it worked flawlessly.

The remainder of this post provides the code and instructions needed to automate image-label verification. The environment requires an AWS account, training images stored in S3, and Boto3 (AWS’s Python SDK).

Environment Requirements, Source: Russell W. Myers

Step 1 — Instantiate S3 and Rekognition Boto3 Clients:

Install Boto3 and separate images by label in an S3 bucket directory. In the code block below, we first instantiate s3_client with Boto3 and set bucket_name to the S3 bucket containing the training imagery. If the images are in a folder within the bucket, we must also set the prefix variable to that folder’s path within the bucket.

import boto3

s3_client = boto3.client('s3')
bucket_name = 'your-bucket-name'
prefix = 'images-directory-path/'

rek_client = boto3.client(
    'rekognition',
    aws_access_key_id = 'your_key_id',
    aws_secret_access_key = 'your_secret_key',
    region_name = 'us-east-1'
)

Step 2 — Calibrate Rekognition:

Next we build a list of the labels Rekognition associates with your desired class. For example, an image of a feral pig may return: hog, pig, and wild boar. To avoid false negatives during label automation, we want to include every plausible label above the desired Confidence (this score accompanies every returned label). This is accomplished by selecting 3–5 images that exemplify the target label, ideally as isolated instances free of any extraneous objects. Copy these image paths to the test_images list:

test_images = ['image-1-path', 'image-2-path', 'image-3-path']

The code block below calls Rekognition on each of our test images, captures up to 10 labels for each, and appends labels with a Confidence > 85% to animal_list. Finally, duplicates are removed by converting animal_list to the set test_labels.

## calibrate Rekognition's response to the target class
animal_list = list()
for img in test_images:
    response = rek_client.detect_labels(
        Image={
            'S3Object': {
                'Bucket': bucket_name,
                'Name': img,
            }
        },
        MaxLabels = 10,
    )
    for label in response['Labels']:
        if label['Confidence'] > 85:
            animal_list.append(label['Name'])

## create a set of unique image labels from our test images
test_labels = set(animal_list)
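To make the filtering concrete, here is the same confidence-threshold-plus-set logic run against a mocked detect_labels response. The label names and scores below are invented for illustration, but the dict structure mirrors what Rekognition returns:

```python
## illustrative shape of a detect_labels response; names/scores invented
mock_response = {
    'Labels': [
        {'Name': 'Pig', 'Confidence': 97.1},
        {'Name': 'Hog', 'Confidence': 91.4},
        {'Name': 'Mammal', 'Confidence': 99.8},
        {'Name': 'Pig', 'Confidence': 97.1},   # duplicate across test images
        {'Name': 'Grass', 'Confidence': 62.0}, # below the 85% threshold
    ]
}

## keep only labels above the 85% confidence threshold
animal_list = []
for label in mock_response['Labels']:
    if label['Confidence'] > 85:
        animal_list.append(label['Name'])

## the set drops the duplicate 'Pig'
test_labels = set(animal_list)
print(sorted(test_labels))  # ['Hog', 'Mammal', 'Pig']
```

Note that the low-confidence 'Grass' label never reaches test_labels, which is exactly why the threshold matters during calibration.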

Step 3.1 — Run Label Verification with Rekognition:

With labels calibrated, we’re going to recycle a modified version of the code above on our training imagery. The primary difference is using an S3 paginator, which is needed because list_objects_v2 returns at most 1,000 keys per call, and because S3’s flat namespace uses the prefix abstraction rather than traditional directories.
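As a standalone aside before the real loop, the pagination pattern can be sketched with mocked pages. The dict structure mirrors list_objects_v2 output; the keys below are invented:

```python
## mocked list_objects_v2 pages; a real page holds up to 1,000 keys
pages = [
    {'Contents': [{'Key': 'images/horse-001.jpg'},
                  {'Key': 'images/horse-002.jpg'}]},
    {'Contents': [{'Key': 'images/cow-001.jpg'}]},
    {},  # a page with no matching keys omits 'Contents' entirely
]

## flatten every page into one list of object keys
keyString_list = []
for page in pages:
    for obj in page.get('Contents', []):
        keyString_list.append(obj['Key'])

print(len(keyString_list))  # 3
```

The `page.get('Contents', [])` guard serves the same purpose as the `if 'Contents' in page` check in the real loop below: an empty result page simply contributes no keys.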

## continued from the code in Step 2
keyString_list = list()
bad_pics = 0

## create a botocore.paginate.PageIterator object from the S3 images
paginator = s3_client.get_paginator('list_objects_v2')
result = paginator.paginate(Bucket = bucket_name, Prefix = prefix)

## unpack the image file keystrings from the paginator results
for page in result:
    if 'Contents' in page:
        for key in page['Contents']:
            keyString = key['Key']
            keyString_list.append(keyString)

            ## call Rekognition with the file's keyString
            try:
                rek_response = rek_client.detect_labels(
                    Image={
                        'S3Object': {
                            'Bucket': bucket_name,
                            'Name': keyString,
                        }
                    },
                    MaxLabels = 10,
                )
                ## keep response labels above 85% confidence
                labels_list = []
                for label in rek_response['Labels']:
                    if label['Confidence'] > 85:
                        labels_list.append(label['Name'])

Step 3.2 — Customize Label Verification with Rekognition:

In the last segment, I added further customization to ensure my training imagery was hyper-focused on class labels. Rekognition can filter your training set, simultaneously confirming label presence while excluding specified noise! In this example, the powerful functionality to remove any photos containing people requires nothing more than adding or ('Person' in labels_list) to the conditional statement highlighted below. To avert any gnashing of teeth, I would advise including user confirmation before deleting the files (although I have yet to find a photo mistakenly slated for deletion).

                ## compare labels_list to test_labels and remove false negatives
                ## note: indentation continues from the block above
                labels_list = set(labels_list)
                if (not labels_list.intersection(test_labels)) or ('Person' in labels_list):
                    s3_client.delete_object(Bucket = bucket_name, Key = keyString)
                    bad_pics += 1
            except Exception:
                print('Bad image:', keyString)

print('{} images processed.'.format(len(keyString_list)))
print('Deleted {} images.'.format(bad_pics))
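The deletion decision in Step 3.2 reduces to a set-intersection test. Here it is isolated as a small helper with invented label sets, including the optional person exclusion (the function name and sample labels are illustrative, not part of the original code):

```python
## calibration set as produced in Step 2 (labels invented for illustration)
test_labels = {'Pig', 'Hog', 'Wild Boar'}

def should_delete(labels_list, test_labels):
    """Delete when no returned label matches the calibration set,
    or when a person appears in the photo."""
    labels = set(labels_list)
    return (not labels.intersection(test_labels)) or ('Person' in labels)

print(should_delete(['Hog', 'Mammal'], test_labels))    # False: matches 'Hog'
print(should_delete(['Banana', 'Fruit'], test_labels))  # True: no overlap
print(should_delete(['Pig', 'Person'], test_labels))    # True: person present
```

The third case shows why the exclusion must test the image’s own labels_list rather than the calibration set: the photo genuinely contains a pig, but the person in it makes it noise worth removing.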

Pink Elephants & Whitetail Deer:

This all raises the obvious question: if AWS Rekognition has solved so many computer vision problems, why build your own models at all? The answer is specialization: while Rekognition is a 90% solution for most image and video analysis, specific use cases require custom models.

Wildlife species identification is one example where Rekognition’s fidelity is limited: testing Rekognition with images of Whitetail Deer returns “impala”, an aesthetically similar animal from a different taxonomic family, indigenous to another continent. Considering that dozens of Whitetail Deer subspecies are also spread throughout the Americas, we can conclude the hunting guide profession is currently outside disruption’s path.

While Rekognition is broadly accurate, more specific applications may require a model with deeper discernment. However, if the goal is a fast, accurate, and inexpensive alternative to manually cleaning training imagery, Rekognition is the solution.

Thanks for reading, and reach out with any questions on LinkedIn. The full codebase is available in the CNN-Species-Identification repo on GitHub.


Russell W. Myers

Google Solutions Consultant, Data & ML. B.S. Electrical & Computer Engineering, USC. Texas MBA. Former USMC Officer | Aviator.