Python Based PDF Generator Lambda Function with S3 Event Trigger

Kenechukwu Nnamani
Published in Analytics Vidhya
Apr 9, 2020
schematic diagram

Introduction

S3 is great for storing objects of any kind, such as images, documents, and binaries. It offers virtually unlimited storage on durable, highly available, and scalable infrastructure at very low cost, although a single object cannot exceed 5 TB. See the S3 FAQ for details.

Businesses often want to generate a PDF version of a file that they save in an S3 bucket. This article is a step-by-step guide to setting up such a service with a Lambda function, using sample code written in Python.

Architecture Explanation

The S3 bucket, named kene-test, will have two prefixes (folders): incoming and incoming_PDF. As the names suggest, incoming is the folder where we push the S3 objects, while incoming_PDF holds the generated PDF versions of the files in the incoming folder.

The Lambda function will be triggered by the “All object create events” S3 event. After converting the file to PDF, the function saves the result in the Lambda /tmp folder and then pushes it into the incoming_PDF folder. It is important to note that /tmp is the only writable path in a Lambda function and provides 512 MB of storage by default.
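The flow described above depends on reading the bucket and key out of the S3 event and deciding where the PDF should go. Here is a minimal sketch of that bookkeeping. The helper names (source_objects, pdf_key_for, tmp_path_for) are made up for illustration, and the rule of swapping the file extension for .pdf is an assumption, not necessarily what the repository code does.

```python
import posixpath
from urllib.parse import unquote_plus


def source_objects(event):
    """Yield (bucket, key) pairs from an S3 notification event."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the event (a space shows up as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def pdf_key_for(key, src_prefix="incoming/", dst_prefix="incoming_PDF/"):
    """Map incoming/<name>.<ext> to incoming_PDF/<name>.pdf (assumed naming rule)."""
    if not key.startswith(src_prefix):
        raise ValueError(f"{key!r} is not under {src_prefix!r}")
    stem, _ext = posixpath.splitext(key[len(src_prefix):])
    return f"{dst_prefix}{stem}.pdf"


def tmp_path_for(pdf_key):
    """Scratch location for the PDF; /tmp is the only writable path in Lambda."""
    return posixpath.join("/tmp", posixpath.basename(pdf_key))
```

For example, an upload named incoming/my report.txt would be written to /tmp/my report.pdf and then pushed to incoming_PDF/my report.pdf.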

S3 Bucket

Go to the AWS Console and create an S3 bucket (in this case kene-test) with two folders as shown below.

kene-test S3 Bucket

Lambda Function

Add the policy below to the Lambda function's role. It allows the function to perform S3 Get and Put actions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::*/*"
    }
  ]
}
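The policy above allows Get and Put on objects in every bucket. If you would rather limit it to the two folders used in this article, a narrower policy could be generated as in the sketch below. The build_s3_policy helper is hypothetical, and splitting read and write access by prefix is a suggestion rather than part of the original setup.

```python
import json


def build_s3_policy(bucket="kene-test",
                    read_prefix="incoming/",
                    write_prefix="incoming_PDF/"):
    """Return an IAM policy document scoped to the article's bucket layout."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The function only needs to read the uploaded source files.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{read_prefix}*",
            },
            {
                # ...and to write the generated PDFs.
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{write_prefix}*",
            },
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the role's policy editor.
    print(json.dumps(build_s3_policy(), indent=2))
```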

Click the Add Trigger button on the Lambda function interface to configure the function to run whenever an object is pushed into the incoming folder of the kene-test S3 bucket.

Add Trigger

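Type S3 in the search bar and select it. Fill in the form according to the image shown below, choosing the kene-test bucket, the “All object create events” event type, and the incoming folder as the prefix. Click Save when you are done.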

Now your Lambda function is configured to be triggered whenever an object is created in the incoming folder of kene-test S3 bucket.
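The same trigger can also be set up from code. The sketch below uses boto3's put_bucket_notification_configuration call. It assumes S3 already has permission to invoke the function (the console adds this for you), and the function ARN in the usage note is a placeholder. Note the trailing slash in the prefix: prefix filters are plain string matches, so "incoming" alone would also match "incoming_PDF/" and the function would be triggered by its own output.

```python
def notification_config(function_arn, prefix="incoming/"):
    """Build the notification settings for 'All object create events'."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [{"Name": "prefix", "Value": prefix}]
                    }
                },
            }
        ]
    }


def attach_trigger(bucket, function_arn):
    """Apply the configuration; this replaces the bucket's existing notification settings."""
    import boto3  # imported here so the helper above can be used without AWS access

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=notification_config(function_arn),
    )
```

A call would look like attach_trigger("kene-test", "arn:aws:lambda:us-east-1:123456789012:function:pdf-generator"), with the ARN swapped for your own function's.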

Lambda Function Python Code

There are three ways to add your code to the Lambda function:

  1. Edit code inline
  2. Upload a .zip of the code and the code dependencies
  3. Upload a file from S3

In this short exercise, we will upload the code and the code packages as a .zip file.

The sample code is in the repository linked below. If you want to create the .zip file yourself, include fpdf in it. boto3 is already provided by the Lambda Python runtime, so bundling it is optional.
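If you do build the .zip yourself, the steps are to install the dependencies into a build folder, copy the handler file next to them, and zip the folder's contents. Here is a rough sketch of those steps. The function names and the lambda_function.py file name in the usage note are placeholders, not names taken from the repository.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def install_dependencies(build_dir, packages=("fpdf",)):
    """Install packages into build_dir so they sit at the root of the .zip."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "--target", str(build_dir), *packages],
        check=True,
    )


def make_deployment_zip(handler_file, build_dir, zip_base):
    """Copy the handler next to its dependencies and zip the folder contents.

    zip_base is the archive path without the .zip suffix and should live
    outside build_dir. Returns the path of the created archive.
    """
    build_dir = Path(build_dir)
    build_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(handler_file, build_dir / Path(handler_file).name)
    return shutil.make_archive(str(zip_base), "zip", root_dir=str(build_dir))
```

A typical run would call install_dependencies("build") followed by make_deployment_zip("lambda_function.py", "build", "function"), which produces function.zip ready for upload.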

You can get the full .zip by cloning my GitHub repo.

git clone https://github.com/Kenec/Lambda-PDF-Generator.git
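The repository holds the actual code. For readers who just want to see the overall shape, here is a minimal sketch of what such a handler might look like. It is not the repository's code: it assumes the uploaded files are plain text, uses the classic fpdf package with its built-in Arial font (which only covers Latin-1 characters), and the helper names are illustrative.

```python
import os
from urllib.parse import unquote_plus

SRC_PREFIX = "incoming/"
DST_PREFIX = "incoming_PDF/"


def destination_key(key):
    """Return the PDF key for a source key, or None if the key should be skipped."""
    if not key.startswith(SRC_PREFIX) or key.endswith("/"):
        return None  # ignore the folder placeholder and anything outside incoming/
    stem = os.path.splitext(os.path.basename(key))[0]
    return f"{DST_PREFIX}{stem}.pdf"


def to_latin1(text):
    """Replace characters the built-in PDF fonts cannot encode with '?'."""
    return text.encode("latin-1", "replace").decode("latin-1")


def text_to_pdf(text, pdf_path):
    """Render plain text into a single-font PDF file."""
    from fpdf import FPDF  # shipped inside the deployment .zip

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Arial", size=12)
    pdf.multi_cell(0, 8, to_latin1(text))  # width 0 means "up to the right margin"
    pdf.output(pdf_path)


def lambda_handler(event, context):
    import boto3  # provided by the Lambda Python runtime

    s3 = boto3.client("s3")
    written = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        dst_key = destination_key(key)
        if dst_key is None:
            continue
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        pdf_path = os.path.join("/tmp", os.path.basename(dst_key))
        text_to_pdf(body.decode("utf-8", errors="replace"), pdf_path)
        s3.upload_file(pdf_path, bucket, dst_key)
        os.remove(pdf_path)  # keep /tmp from filling up on warm invocations
        written.append(dst_key)
    return {"written": written}
```

The imports of fpdf and boto3 sit inside the functions only so that the small helpers can be tried out locally without either package installed. In a deployed function you would normally import them at the top of the file.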

The next step is to edit the Handler text field so that it points to our own handler. The value takes the form file name (without .py), a dot, then the function name. Change it as shown in the image below so that Lambda can execute the code.

Testing

To test the setup, upload a file to the incoming folder in S3. This invokes the function, which generates a PDF version of the file you uploaded and saves it to the incoming_PDF folder.
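The same test can be scripted with boto3: upload a file, then wait for the PDF to appear. The sketch below assumes the function names its output after the uploaded file with a .pdf extension. Check what your function actually writes and adjust expected_keys accordingly. Both helper names are hypothetical.

```python
from pathlib import Path


def expected_keys(local_path):
    """Return (source key, PDF key) for a local file, under the assumed naming rule."""
    path = Path(local_path)
    return f"incoming/{path.name}", f"incoming_PDF/{path.stem}.pdf"


def upload_and_wait(local_path, bucket="kene-test"):
    """Upload a file to incoming/ and block until the PDF shows up in incoming_PDF/."""
    import boto3  # needs AWS credentials configured locally

    s3 = boto3.client("s3")
    src_key, pdf_key = expected_keys(local_path)
    s3.upload_file(str(local_path), bucket, src_key)
    # Poll with HEAD requests every 2 seconds, for up to about a minute.
    s3.get_waiter("object_exists").wait(
        Bucket=bucket,
        Key=pdf_key,
        WaiterConfig={"Delay": 2, "MaxAttempts": 30},
    )
    return pdf_key
```

If the waiter times out, the function's CloudWatch logs are the first place to look.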

Conclusion

This simple setup can be one part of a larger, more complex architecture. It can also be tweaked to perform other tasks, such as notifying the team on a Slack channel when the operation completes.

Thank you for reading.
