Infrastructure as Code with Serverless Framework in AWS

Suminda Niroshan
Aug 3, 2019

Figure 01 — Serverless Framework Deployment

When developing applications in the cloud, we worry less about the underlying infrastructure because it is managed for us. Creating a few microservices and getting them up and running in the cloud is easy and seamless.

When we build cloud applications, however, we lose some control. Having the source code of the microservices in source control is not enough to get the overall system up and running in a new environment, because the code is only part of the system. The infrastructure, such as the runtime environments that run the code and the API Gateway used to access it, still has to be built manually.

What’s the solution? Infrastructure as Code! As the name implies, this method makes it possible to represent the overall cloud system as code.

There are several frameworks and tools for this:

  • Serverless Framework
  • AWS CloudFormation
  • AWS Serverless Application Model (SAM)
  • Zappa
  • Claudia.js
  • Terraform

In this guide, “Serverless Framework” will be used because it is cloud agnostic and supports many cloud platforms, such as AWS, Azure and Google Cloud. You can find a comparison with other frameworks from this link.

I will walk through configuring, in Serverless Framework, a cloud system that uses several AWS services and that I built in a previous post.

Prerequisites

You need an AWS account and some basic knowledge of working with AWS services. The following AWS services will be used throughout this guide.

  • Lambda Service
  • Textract Service
  • Simple Notification Service
  • Simple Storage Service
  • Identity and Access Management Service

Make sure to have Node.js (with npm) and Python 3 (with pip) installed; both are used by the commands in this guide.

You will learn

How to configure all of the services above in Serverless Framework as infrastructure as code.

Serverless Framework

Serverless Framework is open source and written in Node.js. It was initially developed for building applications on the AWS platform, and it now supports Azure, Google Cloud, Oracle Cloud and more.

It handles most of the boilerplate needed to generate infrastructure as code on each platform, and it has great community support. At the time of writing, it supports 8 different cloud providers with the same developer experience.

You can find more info here.

Demo Application

Figure 02 — Demo AWS Textract System

A demo application will be deployed using Serverless Framework (See figure 02).

The basic functionality of the demo application is really simple. Whenever a PDF document is uploaded to the S3 bucket, a Lambda function is triggered and starts a text extraction job in the AWS Textract service. Once Textract completes the job, it sends a notification to the AWS Simple Notification Service (SNS), which triggers another Lambda function. This second Lambda reads the job ID from the notification payload, fetches the text extraction result and writes it to a text file in the S3 bucket with the same name as the PDF.

The AWS SDK for Python (boto3) is added as a Lambda layer for both Lambdas.

To summarize, when we upload a PDF document to the S3 bucket, the system should output a text file with the same name that contains the extracted text.

The demo application consists of the following AWS services, which we need to configure as infrastructure as code in Serverless Framework.

  • Lambda Service
  • Textract Service
  • Simple Notification Service
  • Simple Storage Service

You can get the code from this GitHub repo.

Getting Started

Serverless Framework is installed via the Node package manager (npm). Open a command prompt and execute the following command.

npm install -g serverless

Create a new project using the following command.

serverless create --template aws-python3 --path Serverless-Framework-AWS-Textract

This will create a project with a sample Python 3 based Lambda function. Go into the created “Serverless-Framework-AWS-Textract” directory from the command line.

Install the Serverless Framework plugin “serverless-pseudo-parameters” with the following command. This plugin lets you reference CloudFormation pseudo parameters, such as the AWS account ID and region, with a short #{AWS::AccountId} style syntax instead of writing longer CloudFormation expressions.

npm install serverless-pseudo-parameters

Next, install the boto3 Python AWS SDK, which is required by the Lambda functions, using the following command.

pip install -t lib/boto3/python boto3

Creating The AWS Infrastructure as Code

Open the serverless.yml file and delete the default content. The following is a breakdown of the main components of the serverless.yml that we’ll be using.

Service
The name of the Serverless Framework service (the project).

Custom
This section defines the custom variables that will be used throughout the configuration file.

The value of the currentStage variable is taken from the deployment settings. If a stage such as “dev”, “qa” or “staging” is provided to the Serverless Framework CLI when deploying, that value is used; otherwise the default value “dev” is used. This variable will be used to tag global AWS resources such as S3 buckets and IAM roles.

S3 bucket names are globally unique, so the bucket must be given a name that no one else is using.
The ID of the AWS account where the resources will be deployed is used to make s3BucketName unique. Note the usage of the plugin “serverless-pseudo-parameters” when retrieving the AWS account ID: #{AWS::AccountId}.
To make things clearer, the value of currentStage is also appended to s3BucketName.

The same goes for the IAM role name, which must be unique within the AWS account across all regions.

Tagging resources with the current stage allows us to deploy multiple stages in the same AWS account, which will be demonstrated at the very end.
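To make the naming idea concrete, here is a minimal Python sketch of how such stage-tagged names could be composed. The service name “textract-demo”, the role name suffix and the exact ordering of the parts are assumptions for illustration only; the real names come from the custom section of serverless.yml.

```python
import re


def stage_scoped_names(account_id: str, stage: str = "dev",
                       service: str = "textract-demo") -> dict:
    """Compose stage- and account-scoped resource names.

    The naming scheme here is hypothetical; it only mirrors the idea of
    appending the AWS account ID and the current stage.
    """
    bucket = f"{service}-{account_id}-{stage}".lower()
    role = f"{service}-sns-access-role-{stage}"

    # S3 bucket names must be 3-63 characters long, use lowercase letters,
    # digits, dots or hyphens, and start and end with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", bucket):
        raise ValueError(f"invalid S3 bucket name: {bucket}")

    return {"s3BucketName": bucket, "roleName": role}


if __name__ == "__main__":
    print(stage_scoped_names("123456789012"))        # default stage "dev"
    print(stage_scoped_names("123456789012", "qa"))  # explicit stage
```

Because both the account ID and the stage are part of each name, two stages in the same account never collide, and the same stage in two accounts does not collide either.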

Please find more info on variables here.

Provider
Defines provider and related global settings.

In this demo, the provider is AWS.
AWS managed policies are also defined here. They are applied globally to all Lambda functions defined by this service in order to grant them permissions.
The runtime of the Lambda functions is also specified globally here.
Please find more info on provider here.

Plugins

Specifies any plugins that will be used. Please find more info on plugins here.

Layers

To use the boto3 Python SDK for AWS in Lambda functions, it needs to be packaged and deployed as a Lambda layer. This is the configuration where the Lambda layer will be created.
boto3Layer is the name of the layer within the Serverless configuration file. We can reference this layer by this name within the configuration to apply it to Lambda functions.
path defines the directory path to the boto3 source which we have installed using pip (lib/boto3/python/<Code Files>).
Please find more info on layers here.

Note that layers are not required in order to add packages. You can install a package into the Lambda root directory using pip and have Python import the package from the local directory. I have used layers to further showcase the capabilities of Serverless Framework.
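The python subdirectory in that path matters: for Python runtimes, Lambda extracts layers under /opt and adds /opt/python to the import path, so packages are expected inside a top-level python folder in the layer. A small sketch for checking the local layout before deploying follows; the helper name is hypothetical and it assumes the layer path is the folder that contains python.

```python
from pathlib import Path


def layer_has_python_dir(layer_root: str, package: str = "boto3") -> bool:
    """Return True if <layer_root>/python/<package> exists as a directory.

    layer_root is assumed to be the folder configured as the layer path,
    for example lib/boto3 after running the pip command shown earlier.
    """
    return (Path(layer_root) / "python" / package).is_dir()


if __name__ == "__main__":
    # Prints True only when boto3 was installed into lib/boto3/python.
    print(layer_has_python_dir("lib/boto3"))
```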

Resources

This is where the AWS resources that Serverless Framework will create are defined, such as IAM roles, S3 buckets or AWS databases.

Please note that the event resources of Lambda functions are created automatically once they are defined in the events configuration section. There is no need to re-define them in the resources section.

In this demo, only an IAM role is created here. The AWS Textract service assumes this role in order to get access to AWS SNS.

The AssumeRolePolicyDocument section has the same structure as a policy document configured in AWS. It allows the Textract service to assume the role.

The ManagedPolicyArns section specifies the IAM managed policy that allows full access to AWS SNS.
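For reference, the shape of these two sections can be sketched as plain Python data that mirrors the CloudFormation properties of an IAM role. The role name is a placeholder, and I am assuming the managed policy in question is AmazonSNSFullAccess; check your own serverless.yml for the actual values.

```python
import json

# Trust policy: lets the Textract service assume the role.
TEXTRACT_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "textract.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Assumed managed policy granting full access to SNS.
SNS_FULL_ACCESS_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonSNSFullAccess"


def role_properties(role_name: str) -> dict:
    """Return the Properties block of an AWS::IAM::Role resource."""
    return {
        "RoleName": role_name,
        "AssumeRolePolicyDocument": TEXTRACT_TRUST_POLICY,
        "ManagedPolicyArns": [SNS_FULL_ACCESS_POLICY_ARN],
    }


if __name__ == "__main__":
    # "sns-access-role-dev" is a hypothetical role name.
    print(json.dumps(role_properties("sns-access-role-dev"), indent=2))
```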

Please find more info on resources here.

Functions

This is where we define Lambda functions pdfgetjobresult and pdfextractstart.

Note that both Lambdas add boto3Layer as a Lambda layer by referencing it as follows:
- { Ref: Boto3LayerLambdaLayer }
This references the resource by its normalized name. Please go to this link to find out more.
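As a rough illustration of how boto3Layer becomes Boto3LayerLambdaLayer, the rule for simple names can be sketched in Python. This covers only alphanumeric names; as far as I know the framework also rewrites characters such as hyphens and underscores, which this sketch does not attempt.

```python
def layer_logical_id(layer_name: str) -> str:
    """Approximate the normalized name of a layer defined in serverless.yml.

    Upper-cases the first character and appends "LambdaLayer". Only valid
    for simple alphanumeric layer names.
    """
    if not layer_name.isalnum():
        raise ValueError("this sketch only handles alphanumeric layer names")
    return layer_name[0].upper() + layer_name[1:] + "LambdaLayer"


if __name__ == "__main__":
    print(layer_logical_id("boto3Layer"))
```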

Both Lambda configurations point to their respective code using the handler property. pdfextractstart points to the function handle in the file pdf_extract_start.py, and pdfgetjobresult points to the function handle in the file pdf_extract_result.py.

The pdfextractstart Lambda gets triggered whenever a PDF is uploaded to the S3 bucket. Its events section contains the configuration for this trigger.

The pdfgetjobresult Lambda gets triggered whenever a notification is received on the AWS SNS topic, which is defined in its respective events section.

Please note that the ARN of the pdfgetjobresult Lambda’s SNS topic is saved as an environment variable called PDF_JOB_SNS_TOPIC_ARN in the pdfextractstart Lambda, because it needs to be passed as a parameter to the Textract service from within the Python code. The ARN value is dynamically generated using “serverless-pseudo-parameters” here as well.

The ARN of the snsAccessRole resource created in the resources section is also passed to the pdfextractstart Lambda, as an environment variable called LAMBDA_ROLE_ARN, because it too needs to be passed as a parameter to the Textract service. We have used the Serverless Framework plugin “serverless-pseudo-parameters” to grab this value.
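Both environment variables end up holding ordinary ARNs assembled from the region, the account ID and the resource name. The sketch below shows the two ARN formats; the topic and role names are placeholders, and it assumes the standard “aws” partition.

```python
def sns_topic_arn(region: str, account_id: str, topic_name: str) -> str:
    """Build the ARN of an SNS topic in the standard "aws" partition."""
    return f"arn:aws:sns:{region}:{account_id}:{topic_name}"


def iam_role_arn(account_id: str, role_name: str) -> str:
    """Build the ARN of an IAM role; IAM ARNs carry no region."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


if __name__ == "__main__":
    # Hypothetical names, roughly what the two variables would contain.
    print(sns_topic_arn("us-east-1", "123456789012", "pdf-job-topic-dev"))
    print(iam_role_arn("123456789012", "sns-access-role-dev"))
```

In serverless.yml, the region and account ID parts are what #{AWS::Region} and #{AWS::AccountId} fill in at deployment time.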

Please find more info on functions here.

The complete serverless.yml file follows. Replace the content of your file with it.

Creating Lambda Functions

Delete the existing default handler.py file and create the following two functions in the files pdf_extract_start.py and pdf_extract_result.py.

pdf_extract_start.py Code

Please note that the pdf_extract_start.py code uses the two environment variables that were set in serverless.yml (LAMBDA_ROLE_ARN and PDF_JOB_SNS_TOPIC_ARN). They are passed as parameters to the AWS Textract method, along with the name of the uploaded PDF file and the S3 bucket name.

This triggers a processing job in Textract. When the processing job for the PDF is completed, Textract uses the role LAMBDA_ROLE_ARN to notify the SNS topic with ARN PDF_JOB_SNS_TOPIC_ARN.
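A minimal sketch of what such a handler might look like is shown below. This is not the original file from the repo: I am assuming the asynchronous boto3 call start_document_text_detection, and start_text_detection is a helper name introduced here so the logic can be exercised without AWS.

```python
import os
from urllib.parse import unquote_plus


def start_text_detection(event, textract_client, topic_arn, role_arn):
    """Start an asynchronous Textract job for the PDF named in an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["object"]["key"])

    response = textract_client.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        # Textract assumes role_arn to publish the completion message.
        NotificationChannel={"SNSTopicArn": topic_arn, "RoleArn": role_arn},
    )
    return response["JobId"]


def handle(event, context):
    """Lambda entry point referenced by the handler property."""
    import boto3  # supplied by the boto3 layer at runtime

    return start_text_detection(
        event,
        boto3.client("textract"),
        os.environ["PDF_JOB_SNS_TOPIC_ARN"],
        os.environ["LAMBDA_ROLE_ARN"],
    )
```

Passing the client in as a parameter keeps the core function easy to test locally with a stand-in object.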

pdf_extract_result.py Code

pdf_extract_result.py contains the function that gets triggered when AWS Textract sends a notification to the SNS topic with ARN PDF_JOB_SNS_TOPIC_ARN. Once triggered, it extracts the PDF processing JobId from the SNS payload and uses it to get the extracted text content. After successfully retrieving the Textract result, it creates a new text file containing the extracted text, with the same name as the uploaded PDF, in the same S3 bucket where the PDF document was uploaded.

Overall, the above two functions are responsible for passing the uploaded PDF to the AWS Textract service and extracting the content into a text file in the same S3 bucket.
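The steps described above could be sketched roughly as follows. Again, this is an illustration rather than the original file: it assumes the boto3 call get_document_text_detection, the message fields JobId, Status and DocumentLocation as I understand the Textract completion notification, and a .txt extension for the output file. The helper names are hypothetical.

```python
import json
import os


def collect_lines(textract_client, job_id):
    """Gather the text of all LINE blocks, following NextToken pagination."""
    lines, token = [], None
    while True:
        kwargs = {"JobId": job_id}
        if token:
            kwargs["NextToken"] = token
        page = textract_client.get_document_text_detection(**kwargs)
        lines += [block["Text"] for block in page.get("Blocks", [])
                  if block["BlockType"] == "LINE"]
        token = page.get("NextToken")
        if not token:
            return lines


def save_job_result(event, textract_client, s3_client):
    """Write the extracted text next to the PDF and return the new key."""
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    if message["Status"] != "SUCCEEDED":
        raise RuntimeError(f"Textract job ended with {message['Status']}")

    location = message["DocumentLocation"]
    bucket, pdf_key = location["S3Bucket"], location["S3ObjectName"]
    text_key = os.path.splitext(pdf_key)[0] + ".txt"  # assumed extension

    text = "\n".join(collect_lines(textract_client, message["JobId"]))
    s3_client.put_object(Bucket=bucket, Key=text_key, Body=text.encode("utf-8"))
    return text_key


def handle(event, context):
    """Lambda entry point referenced by the handler property."""
    import boto3  # supplied by the boto3 layer at runtime

    return save_job_result(event, boto3.client("textract"), boto3.client("s3"))
```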

Deploying The Application

Time for Serverless Framework’s Magic!

Execute the following command and Serverless Framework should start deploying the resources.

serverless deploy

If needed, you can remove the deployed resource stack at any time by executing the following command.

serverless remove

Please find more useful commands here.

Verify AWS Resource Generation

After deployment, go to your AWS console and you should see the AWS resources specified in serverless.yml, as listed below.

S3 Bucket Generated from Serverless Framework
Two Lambda Functions Generated from Serverless Framework
IAM Role allowing Full SNS Access Generated from Serverless Framework
AWS SNS Topic Generated from Serverless Framework
AWS CloudFormation Stack Generated from Serverless Framework

If these resources have been generated, you can go ahead with the testing below.

If there are any deployment errors, you can investigate by exploring the CloudFormation stack events.

Testing PDF Text Extraction

Go to the S3 bucket and upload a PDF file. You can get a sample PDF file from here. After about a minute, a text file with the same name as the PDF will be generated. This text file contains the text extracted from the PDF.
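If you would rather check for the result from a script than refresh the console, a polling helper along these lines could work. It is only a sketch: wait_for_object is a hypothetical helper, and the bucket and key names in the usage part are placeholders.

```python
import time


def wait_for_object(s3_client, bucket, key, timeout_s=120.0,
                    interval_s=5.0, sleep=time.sleep):
    """Poll S3 until the exact key exists; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        response = s3_client.list_objects_v2(Bucket=bucket, Prefix=key)
        if any(obj["Key"] == key for obj in response.get("Contents", [])):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    import boto3  # requires AWS credentials to be configured locally

    s3 = boto3.client("s3")
    bucket = "your-bucket-name"  # placeholder
    s3.upload_file("sample.pdf", bucket, "sample.pdf")
    print(wait_for_object(s3, bucket, "sample.txt"))
```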

Deploying Multiple Stages in the same AWS Account

Typically when developing software, it’s necessary to have stages like Dev, Staging, QA and Production. With Serverless Framework, deploying into different stages is simple, provided you have configured resource names dynamically by tagging them with the current stage, as we have done.

We have already deployed a Dev stage as it’s the default deployment. Let’s deploy a QA stage. Execute the following command.

serverless deploy --stage qa

Once the deployment is done, you’ll see that for every resource that was created in the Dev stage, there is now an equivalent resource for the QA stage. The following are the two sets of Lambda functions for the Dev and QA stages, which you can now test separately.
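The two sets of functions are easy to tell apart because, unless a function sets an explicit name, Serverless Framework names each Lambda as service-stage-function. A short sketch of the names to expect, assuming a hypothetical service name of “textract-demo”:

```python
def lambda_function_name(service: str, stage: str, function: str) -> str:
    """Default Lambda name when no explicit name is configured."""
    return f"{service}-{stage}-{function}"


if __name__ == "__main__":
    for stage in ("dev", "qa"):
        for function in ("pdfextractstart", "pdfgetjobresult"):
            # "textract-demo" is a placeholder service name.
            print(lambda_function_name("textract-demo", stage, function))
```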
