Building a PDF Generator on AWS Lambda with Python3 and wkhtmltopdf

Update 07/18/2019: Source Code:

UPDATE 06/28/2019: For anyone attempting to follow this, please read a follow-up post about a font bug on the generated PDFs:


This article shows how I built the API, so you can see how easy it is to create your own serverless APIs. When the project is complete, you will have an API endpoint to which you can POST a JSON object and receive a link to a PDF on Amazon S3. The JSON object looks like this:

{
  "filename": "sample.pdf",
  "html": "<html><head></head><body><h1>It works! This is the default PDF.</h1></body></html>"
}

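As a small sketch, the same request body can be built in Python with the standard json module. The helper name build_payload is hypothetical and not part of the project; it only illustrates the two fields the endpoint expects.

```python
import json


def build_payload(filename, html):
    """Return the JSON request body as a string (hypothetical helper)."""
    return json.dumps({"filename": filename, "html": html})


payload = build_payload(
    "sample.pdf",
    "<html><head></head><body><h1>It works! This is the default PDF.</h1></body></html>",
)
print(payload)
```

Using json.dumps rather than string concatenation means quotes and angle brackets inside the HTML are escaped correctly.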

  • Python3
  • AWS CLI installed and configured
  • Serverless installed


Note: ‘sls’ is short for serverless and can be used interchangeably.

sls create --template aws-python3

This command bootstraps a Python3 Lambda setup for you to work from.


Download version 0.12.4 here:

Once you have extracted the tar file, copy the wkhtmltopdf binary into a folder named binary in your project. The handler code in this article refers to it as binary/wkhtmltopdf.
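Before deploying, it can help to confirm that the binary is where the handler expects it and that its execute bit is set. The following is only a sketch of such a check; the helper name is_executable is hypothetical, and the path binary/wkhtmltopdf is taken from the handler code in this article.

```python
import os


def is_executable(path):
    """Return True if path is a regular file that the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


# Path used by the handler's pdfkit configuration
print(is_executable(os.path.join("binary", "wkhtmltopdf")))
```

If this prints False on your machine, check that the file was copied into the binary folder and, on Linux or macOS, that it is marked as executable.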


More information about wkhtmltopdf can be found on their website:

Python3 Dependencies

In the project folder, install the serverless-python-requirements plugin for Serverless.

sls plugin install -n serverless-python-requirements

Now, in your serverless.yml file, you need to add a custom section in the yaml:

custom:
  pythonRequirements:
    dockerizePip: true

Once the serverless-python-requirements plugin is installed, you can add a requirements.txt file to your project, and the packages it lists will be installed automatically and bundled with your Lambda when you deploy.

Your requirements.txt for this project only needs to have pdfkit.



For any issues with this plugin, check out the repository's issue tracker:
Serverless Python Requirements

Ready, Set, Code


For this serverless.yml configuration, you’ll need to create a file called config.yml. This will store the S3 bucket name. The serverless.yml will reference the config.yml to set up the correct bucket for your project.

Contents of config.yml

BucketName: 'your-s3-bucket-name'

Here is a high-level overview of a serverless.yml file:

service: pdf-services # name or reference to our project
provider: # It is possible to use Azure, GCloud, or AWS
functions: # Array of functions to deploy as Lambdas
resources: # S3 buckets, DynamoDB tables, and other possible resources to create
plugins: # Plugins for Serverless
custom: # Custom variables used by you or plugins during setup and deployment

Our serverless configuration will do a few things for us when we deploy:

  1. Create an S3 bucket, named by BucketName in config.yml, to store our PDFs
  2. Create a function that will create the PDFs
  3. Give our function access to the S3 bucket
  4. Set up an API endpoint for our Lambda function at:

Here is the full serverless.yml configuration. I’ve added a couple of important comments in the code.

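service: pdf-service

provider:
  name: aws
  runtime: python3.7
  # Set environment variable for the S3 bucket
  environment:
    S3_BUCKET_NAME: ${file(./config.yml):BucketName}
  # Gives our functions full read and write access to the S3 Bucket
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "s3:*"
      Resource:
        - arn:aws:s3:::${file(./config.yml):BucketName}
        - arn:aws:s3:::${file(./config.yml):BucketName}/*

functions:
  # The function key is a placeholder name; the handler is module.function
  generate_pdf:
    handler: handler.generate_pdf
    events:
      - http:
          path: new-pdf
          method: post
          cors: true

resources:
  Resources:
    # Creates an S3 bucket in our AWS account (PdfBucket is a placeholder logical ID)
    PdfBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: ${file(./config.yml):BucketName}

custom:
  pythonRequirements:
    dockerizePip: true

plugins:
  - serverless-python-requirements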

  • Context contains environment variables and system information.
  • Event contains the request data sent to the Lambda function.

In this project, we will send our function generate_pdf a filename and HTML, and it will return the URI for a PDF it creates.
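To make the event argument more concrete, here is a minimal sketch of reading the filename and HTML from it. It assumes the request body arrives as a JSON string under the "body" key, which is what the handler in this article expects; the helper name read_request and the sample event are illustrative only.

```python
import json

DEFAULT_KEY = "default-filename.pdf"
DEFAULT_HTML = "<html><head></head><body><h1>It works! This is the default PDF.</h1></body></html>"


def read_request(event):
    """Return (filename, html) from the event, falling back to the defaults."""
    body = event.get("body")
    if not body:
        # No body was sent, for example when the function is invoked by hand
        return DEFAULT_KEY, DEFAULT_HTML
    data = json.loads(body)
    return data.get("filename", DEFAULT_KEY), data.get("html", DEFAULT_HTML)


# Illustrative event with only the field this sketch cares about
sample_event = {"body": json.dumps({"filename": "invoice.pdf", "html": "<h1>Hi</h1>"})}
print(read_request(sample_event))
print(read_request({}))
```

Falling back to defaults keeps the function usable from the Lambda console, where a test event may have no body at all.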

import json
import os

import boto3
import pdfkit

client = boto3.client('s3')

# Get the bucket name from the environment variable set in serverless.yml
S3_BUCKET_NAME = os.environ.get('S3_BUCKET_NAME')


def generate_pdf(event, context):
    # Defaults
    key = 'default-filename.pdf'
    html = "<html><head></head><body><h1>It works! This is the default PDF.</h1></body></html>"

    # Decode JSON and set values for our PDF
    if event.get('body'):
        data = json.loads(event['body'])
        key = data['filename']
        html = data['html']

    # Set file path to save the PDF on Lambda first (temporary storage)
    filepath = '/tmp/{key}'.format(key=key)

    # Create PDF using the wkhtmltopdf binary bundled with the project
    config = pdfkit.configuration(wkhtmltopdf="binary/wkhtmltopdf")
    pdfkit.from_string(html, filepath, configuration=config, options={})

    # Upload to S3 Bucket
    with open(filepath, 'rb') as pdf_file:
        client.put_object(
            Body=pdf_file,
            Bucket=S3_BUCKET_NAME,
            Key=key,
            ContentType='application/pdf',
        )

    # Format the PDF URI (virtual-hosted S3 URL; the object must be
    # readable by whoever opens the link)
    object_url = "https://{0}.s3.amazonaws.com/{1}".format(S3_BUCKET_NAME, key)

    # Response with result
    response = {
        "headers": {
            "Access-Control-Allow-Origin": "*",
            "Access-Control-Allow-Credentials": True,
        },
        "statusCode": 200,
        "body": object_url,
    }

    return response
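The last two steps of the handler, formatting the object URL and building the response, can also be sketched as standalone helpers that are easy to test without AWS. This assumes the virtual-hosted S3 URL form https://<bucket>.s3.amazonaws.com/<key>; the helper names s3_object_url and build_response are hypothetical, and URL-encoding the key is an addition of this sketch so that filenames with spaces still produce a valid link.

```python
from urllib.parse import quote


def s3_object_url(bucket, key):
    """Build a virtual-hosted S3 URL; the key is percent-encoded, slashes are kept."""
    return "https://{0}.s3.amazonaws.com/{1}".format(bucket, quote(key))


def build_response(body, status_code=200):
    """Build a response dict with the same CORS headers the handler returns."""
    return {
        "statusCode": status_code,
        "headers": {
            "Access-Control-Allow-Origin": "*",
            "Access-Control-Allow-Credentials": True,
        },
        "body": body,
    }


url = s3_object_url("your-s3-bucket-name", "my sample.pdf")
print(build_response(url))
```

Keeping these pieces separate from the pdfkit and boto3 calls makes it possible to check the response shape locally before deploying.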


sls deploy

After you run deploy, Serverless will create everything for you in AWS. You will get an HTTP POST endpoint that you will use to generate PDFs. The endpoint will look something like this:

You can use curl to test your function. The following curl command posts a JSON object to the Lambda endpoint. The JSON object contains a filename and some HTML to turn into a PDF.

curl -d '{"filename":"my-sample-filename.pdf", "html":"<html><head></head><body><h1>Custom HTML -> Posted From CURL as {JSON}</h1></body></html>"}' -H "Content-Type: application/json" -X POST REPLACE-WITH-YOUR-ENDPOINT

Note: Replace “REPLACE-WITH-YOUR-ENDPOINT” with the endpoint you receive from Serverless.

After running this command you should receive the URI to your generated PDF.
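If you would rather test from Python than from curl, here is a rough equivalent using only the standard library. ENDPOINT is a placeholder that you must replace with the endpoint you receive from Serverless, and the function names are hypothetical. Nothing is sent until you call request_pdf.

```python
import json
import urllib.request

# Placeholder, replace with the endpoint printed by "sls deploy"
ENDPOINT = "https://example.com/dev/new-pdf"


def make_request(endpoint, filename, html):
    """Build (but do not send) the same POST request as the curl command."""
    data = json.dumps({"filename": filename, "html": html}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_pdf(endpoint, filename, html):
    """Send the request and return the response body (the PDF URI) as text."""
    with urllib.request.urlopen(make_request(endpoint, filename, html)) as resp:
        return resp.read().decode("utf-8")


req = make_request(ENDPOINT, "my-sample-filename.pdf", "<h1>Posted from Python</h1>")
print(req.get_method(), req.full_url)
```

Calling request_pdf(ENDPOINT, "my-sample-filename.pdf", "<h1>Posted from Python</h1>") against your real endpoint should return the same URI that the curl command prints.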


Thanks for reading!

Next Steps


Originally published at