Build a Smart Photo Organizer with AWS Rekognition

Jun 26, 2024

This is part 1 of a machine learning on AWS series, which covers various AWS ML services using a hands-on, project-based approach.

Pre-Requisites

Basic Python knowledge
An AWS account (if you don’t have one, go to aws.amazon.com and sign up for a free account)
Basic AWS knowledge (optional but recommended)

Project Overview

We will be using 3 main AWS services for this project:

Amazon Rekognition is an image and video analysis service that uses deep learning to identify objects, people, text, scenes, and activities, and to detect inappropriate content.
Amazon S3 (Simple Storage Service) is a scalable, high-speed, web-based cloud storage service designed to store and retrieve any amount of data from anywhere on the internet.
AWS Lambda is a serverless compute service that automatically runs your code in response to events, scaling seamlessly from a few requests per day to thousands per second.

In this project we will use Rekognition to automatically tag photos as they are uploaded to S3.
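To make the idea concrete, here is a minimal sketch of the core transformation: turning a label-detection response into a set of S3 object tags. It assumes the response has the shape returned by Rekognition's detect_labels call (a "Labels" list whose items carry "Name" and "Confidence"). The function name, the min_confidence threshold and the sample data are illustrative assumptions, not part of the Lambda function used later in this tutorial. The "Processed" marker tag mirrors the one that function adds.

```python
from typing import Any

S3_MAX_TAGS_PER_OBJECT = 10  # S3 allows at most 10 tags on a single object


def labels_to_tagset(response: dict[str, Any],
                     min_confidence: float = 80.0) -> list[dict[str, str]]:
    """Convert a detect_labels-style response into an S3 TagSet.

    min_confidence is an assumed cut-off for this sketch; tune or drop it.
    """
    names = [
        label["Name"]
        for label in response.get("Labels", [])
        if label.get("Confidence", 0.0) >= min_confidence
    ]
    # Keep one slot free for the 'Processed' marker tag.
    names = names[: S3_MAX_TAGS_PER_OBJECT - 1]
    tags = [{"Key": f"Label{i + 1}", "Value": name}
            for i, name in enumerate(names)]
    tags.append({"Key": "Processed", "Value": "True"})
    return tags


# Hand-written sample in the shape Rekognition returns (values are made up).
sample_response = {
    "Labels": [
        {"Name": "Dog", "Confidence": 98.5},
        {"Name": "Pet", "Confidence": 97.1},
        {"Name": "Grass", "Confidence": 62.0},
    ]
}

print(labels_to_tagset(sample_response))
```

With the sample above, "Grass" falls below the assumed threshold, so only "Dog" and "Pet" become label tags, followed by the marker tag.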

Create Roles for Access

Access between AWS services is managed through IAM roles. Since our core logic will live inside the Lambda function, we need to create a role that gives it access to Rekognition, S3 and CloudWatch.

Log in to the AWS Management Console and navigate to the IAM dashboard
Click on “Roles” in the left navigation pane

Click on “Create Role”

Under “Trusted entity type” choose “AWS Service”, under the “Use Case” dropdown select “Lambda” and click Next
Filter and select the “AmazonS3FullAccess”, “CloudWatchLogsFullAccess” and “AmazonRekognitionFullAccess” permission policies

Click Next and assign the role a name, “PhotoOrganizerRole”
Finally finish the flow by clicking “Create Role”
(The permissions applied in this tutorial are overly permissive. In an actual production setup we would restrict them to only the actions the Lambda function really needs.)
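As a rough illustration of what a tighter policy could look like, the sketch below builds an IAM policy document covering only the calls the Lambda function in this tutorial makes: reading the image, reading and writing its tags, detecting labels, and writing logs. The function name and statement IDs are made up for this example, and the result has not been validated against a real account, so treat it as a starting point rather than a drop-in policy.

```python
import json


def build_least_privilege_policy(bucket_name: str) -> dict:
    """Build a narrower IAM policy document for the photo organizer role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read the image (Rekognition reads it with the caller's
                # credentials) and read/write its tags.
                "Sid": "ReadAndTagPhotos",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:GetObjectTagging",
                    "s3:PutObjectTagging",
                ],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                "Sid": "DetectLabels",
                "Effect": "Allow",
                "Action": "rekognition:DetectLabels",
                "Resource": "*",
            },
            {
                # Could be narrowed further to the function's own log group.
                "Sid": "WriteLogs",
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "*",
            },
        ],
    }


policy = build_least_privilege_policy("smart-photo-organizer-bucket")
print(json.dumps(policy, indent=4))
```

The printed JSON could be pasted into a custom policy in the IAM console and attached to the role in place of the three full-access policies.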

Create S3 Bucket

Navigate to S3 in AWS Management Console and click “Create Bucket”

Give it a unique name such as “smart-photo-organizer-bucket” (S3 bucket names must be globally unique, so you will likely need to pick a different one) and click Create, leaving the other options at their default values
After the bucket is created, go to the Permissions tab and paste the following into the bucket policy, replacing the bucket name in each “Resource” with your own. This policy allows Rekognition and Lambda to read our images from the bucket

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "rekognition.amazonaws.com"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::smart-photo-organizer-bucket/*"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "lambda.amazonaws.com"
            },
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion"
            ],
            "Resource": "arn:aws:s3:::smart-photo-organizer-bucket/*"
        }
    ]
}

Create Lambda Function

Next we will create the Lambda function.

Navigate to AWS Lambda in the management console and click “Create Function”

Select “Author from scratch”.
Enter a function name, “SmartPhotoOrganizer”.
Choose a “Python 3.x” runtime (the code below is written in Python).
Under “Permissions”, select “Use an existing role”.
Choose the role you created earlier, “PhotoOrganizerRole”, which has Rekognition, S3 and CloudWatch Logs access.

Click “Create function” and paste the following code into the code editor that appears (the code is heavily commented to explain what each section does)

import json
import boto3
import logging
import urllib.parse

# Set up logging to capture logs for debugging and monitoring
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Initialize the S3 and Rekognition clients using boto3
s3_client = boto3.client('s3')
rekognition_client = boto3.client('rekognition')

def lambda_handler(event, context):
    try:
        # Log the received event for debugging purposes
        logger.info(f"Received event: {json.dumps(event)}")

        # Extract the bucket name and object key from the event
        bucket = event['Records'][0]['s3']['bucket']['name']
        raw_key = event['Records'][0]['s3']['object']['key']

        # S3 event notifications deliver the key URL-encoded (spaces arrive as '+'),
        # so decode it before passing it to the S3 and Rekognition APIs
        key = urllib.parse.unquote_plus(raw_key)
        logger.info(f"Bucket: {bucket}, Key: {key}")

        # Check if the file is a JPEG or PNG by examining the file extension
        if not key.lower().endswith(('.jpg', '.jpeg', '.png')):
            logger.error(f"Invalid image format for file: {key}")
            return {
                'statusCode': 400,
                'body': json.dumps('Invalid image format. Only JPEG and PNG are supported.')
            }

        # Get the object tags from S3
        tagging_response = s3_client.get_object_tagging(Bucket=bucket, Key=key)
        existing_tags = {tag['Key']: tag['Value'] for tag in tagging_response['TagSet']}
        logger.info(f"Existing tags for {key}: {existing_tags}")

        # Check if the object has already been processed
        if 'Processed' in existing_tags:
            logger.info(f"File {key} is already processed and tagged. Skipping...")
            return {
                'statusCode': 200,
                'body': json.dumps('Image already processed.')
            }

        # Call Rekognition to detect labels in the image
        response = rekognition_client.detect_labels(
            Image={
                'S3Object': {
                    'Bucket': bucket,
                    'Name': key
                }
            },
            MaxLabels=10  # Limit the number of labels to detect
        )

        # Extract the detected labels from the Rekognition response
        labels = response['Labels']
        label_names = [label['Name'] for label in labels]
        logger.info(f"Detected labels for {key}: {label_names}")

        # Create tags from the detected labels, keeping at most 9 so that the
        # 'Processed' tag still fits within S3's limit of 10 tags per object
        tags = [{'Key': f'Label{i + 1}', 'Value': label['Name']} for i, label in enumerate(labels[:9])]
        tags.append({'Key': 'Processed', 'Value': 'True'})

        # Add the tags to the S3 object
        s3_client.put_object_tagging(
            Bucket=bucket,
            Key=key,
            Tagging={
                'TagSet': tags
            }
        )

        # Return a success response
        return {
            'statusCode': 200,
            'body': json.dumps('Image processed and tagged successfully!')
        }

    # Handle exceptions for NoSuchKey, meaning the object was not found in S3
    except s3_client.exceptions.NoSuchKey as e:
        logger.error(f"NoSuchKeyException: {str(e)}")
        return {
            'statusCode': 404,
            'body': json.dumps('S3 object not found.')
        }

    # Handle exceptions for invalid image formats recognized by Rekognition
    except rekognition_client.exceptions.InvalidImageFormatException as e:
        logger.error(f"InvalidImageFormatException: {str(e)}")
        return {
            'statusCode': 400,
            'body': json.dumps('Invalid image format.')
        }

    # Handle exceptions for invalid S3 object metadata recognized by Rekognition
    except rekognition_client.exceptions.InvalidS3ObjectException as e:
        logger.error(f"InvalidS3ObjectException: {str(e)}")
        return {
            'statusCode': 400,
            'body': json.dumps('Invalid S3 object. Check object key, region, and/or access permissions.')
        }

    # Handle any other exceptions that may occur
    except Exception as e:
        logger.error(f"An error occurred: {str(e)}")
        return {
            'statusCode': 500,
            'body': json.dumps('An internal error occurred.')
        }

Click on “Deploy”
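Before wiring up the trigger, it can help to see what the function will receive. The sketch below uses a trimmed-down, hand-written S3 event (real events carry many more fields) and shows how the bucket name and object key can be pulled out of it. S3 delivers the object key URL-encoded in event notifications, with spaces arriving as "+", so the example decodes it with urllib.parse.unquote_plus. The helper name and the sample key are assumptions made for illustration.

```python
import urllib.parse


def extract_bucket_and_key(event: dict) -> tuple[str, str]:
    """Return (bucket, decoded object key) from the first record of an S3 event."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Keys in S3 event notifications are URL-encoded; decode before use.
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    return bucket, key


# Minimal stand-in for an S3 "object created" event (illustrative values only).
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "smart-photo-organizer-bucket"},
                "object": {"key": "holiday+photos/beach+%281%29.jpg"},
            }
        }
    ]
}

bucket, key = extract_bucket_and_key(sample_event)
print(bucket)  # smart-photo-organizer-bucket
print(key)     # holiday photos/beach (1).jpg
```

A dictionary like sample_event can also be passed straight to lambda_handler in a local Python session, provided AWS credentials are configured and the object actually exists in the bucket.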

Create CloudWatch Log Group

Navigate to CloudWatch in the AWS Management Console
Click on “Logs” in the left-hand menu.

Click on “Actions” and select “Create log group” (if /aws/lambda/SmartPhotoOrganizer does not already exist)

Enter /aws/lambda/SmartPhotoOrganizer as the log group name.
Click “Create log group”

Connect S3 Bucket to Lambda

Navigate to Configuration tab for your Lambda function and click Triggers
Click “Add Trigger”

Select “S3” as the trigger.
Configure the trigger by selecting the bucket you created earlier.
Choose the event type “All object create events”.
Click “Add” to set up the trigger.
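The trigger above fires for every object created in the bucket, while the function only accepts .jpg, .jpeg and .png files. The “Add trigger” form has an optional suffix field that can narrow this down. For readers who prefer to script the setup, the sketch below builds an equivalent notification configuration in the shape boto3's put_bucket_notification_configuration expects. The function ARN, the Id values and the helper name are placeholders invented for this example. Since each configuration entry can hold only one suffix rule, one entry is created per extension.

```python
def build_notification_config(lambda_arn: str,
                              suffixes=(".jpg", ".jpeg", ".png")) -> dict:
    """Build an S3 notification configuration with one suffix filter per entry."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": f"tag-images-{suffix.lstrip('.')}",  # arbitrary label
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [{"Name": "suffix", "Value": suffix}]
                    }
                },
            }
            for suffix in suffixes
        ]
    }


# Placeholder ARN; substitute the ARN shown on your function's page.
config = build_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:SmartPhotoOrganizer"
)
print([c["Id"] for c in config["LambdaFunctionConfigurations"]])

# Applying it would look roughly like this (requires AWS credentials, and the
# function must already allow S3 to invoke it, which the console sets up for you):
# import boto3
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="smart-photo-organizer-bucket",
#     NotificationConfiguration=config,
# )
```

Suffix filters are case-sensitive, so a file named PHOTO.JPG would not match these rules. The extension check inside the function remains useful as a second line of defence.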

Test the Photo Organizer

Navigate to your S3 bucket and upload a JPG or PNG image file
Open the /aws/lambda/SmartPhotoOrganizer log group in CloudWatch and check the logs: there should be no errors, and the labels returned by Rekognition should be listed
In S3, verify that the tags have been applied to your image (they appear under the object's Properties tab).
This is just one way photos can be organized. Another option would be to move images to other buckets based on the labels returned, for example automatically sorting cat and dog images into separate buckets.
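The bucket-sorting idea can be sketched as a small routing function that picks a destination based on the detected labels. The routing table, bucket names and function name below are hypothetical, and the sketch only decides where an image should go. Actually moving it would additionally involve copying the object to the chosen bucket and deleting the original.

```python
def choose_destination_bucket(label_names: list[str],
                              routing: dict[str, str],
                              default_bucket: str) -> str:
    """Return the bucket of the first routing rule whose label was detected."""
    detected = {name.lower() for name in label_names}
    # Rules are checked in the order they appear in the routing table.
    for label, bucket in routing.items():
        if label.lower() in detected:
            return bucket
    return default_bucket


# Hypothetical routing table: label -> destination bucket.
ROUTING = {
    "Cat": "photo-organizer-cats",
    "Dog": "photo-organizer-dogs",
}

print(choose_destination_bucket(["Animal", "Dog", "Grass"], ROUTING,
                                "photo-organizer-misc"))
print(choose_destination_bucket(["Beach", "Sea"], ROUTING,
                                "photo-organizer-misc"))
```

If an image contains both a cat and a dog, the first matching rule wins, so the order of the routing table acts as a simple priority list.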

In the next part of this series we will build a Receipt Reader using Textract and Comprehend.

Title Background: “200520” by takawo, http://openprocessing.org/sketch/857874. License: Creative Commons Attribution-NonCommercial-ShareAlike, https://creativecommons.org/licenses/by-nc-sa/3.0