Build an AWS Serverless Application using SAM

Hongbo Liu
Aug 25

This story records my first experience with the AWS Serverless Application Model (SAM).

Scenario

We have an on-premises Windows application called MessageSyncProxy. It runs on multiple computers in each customer's local network. MessageSyncProxy collects real-time transaction data generated by a legacy factory manufacturing application. Each running MessageSyncProxy instance pushes the collected transaction data to an AWS SQS (Simple Queue Service) queue created for that customer.

With all the collected transaction data queued in the AWS SQS queue, we need to develop an application to process the data in the queue and push the processed data to an external ERP system through the ERP’s open RESTful API.

Design

It is a simple application. With the data already in an AWS SQS queue, an AWS serverless application is a perfect fit: the queue can trigger a Lambda function to process the data.

We decided to use the serverless architecture here for a couple of reasons:

  1. The serverless architecture takes care of scalability with minimal effort.
  2. The queue can be used as a trigger to drive the data processing, which is a perfect scenario for a serverless application.

We decided to build the serverless application using the AWS Serverless Application Model (AWS SAM) because:

  1. SAM's single deployment configuration works well for a DevOps-oriented organization.
  2. For this simple application we could do all the coding and provisioning manually through the AWS Management Console, but then we could not version-control the infrastructure or automate the deployment to achieve Infrastructure as Code (IaC).
  3. SAM supports local testing and debugging with the SAM CLI.

We decided to code the Lambda function in Node.js 8.10, which supports async/await and classes.

Development environment

  • Install python3 with pip3

This is required by the AWS CLI (Command Line Interface)

Download the installation file from the official site https://www.python.org/downloads/ and run the installation file.

Remember to check the option [Add Python 3.7 to PATH].

The installation includes pip (package installer for Python)

Verify the installation with the following commands:

python --version
pip --version

Then install the AWS CLI with pip:

pip3 install awscli --upgrade --user

After the installation, add the folder C:\Users\{username}\AppData\Roaming\Python\Python37\Scripts to the PATH system variable, so you can use the AWS CLI from any folder.

Verify the installation:

aws --version

After the AWS CLI is installed, I configure the credentials that the AWS CLI will use. The AWS CLI runs everything under that credentials/user account. In other words, the AWS CLI can only do whatever that user account is allowed to do in AWS.

For security, never use the root AWS account for the AWS CLI. I created a separate IAM user for development, granted it the proper permissions by assigning it to an appropriate group, and enabled MFA on the account.

When you create the user account in the AWS Management Console, it shows the secret access key only once. Make sure to copy it and store it somewhere safe.

Run the following command to configure the AWS CLI

aws configure

Fill in the Access Key ID and Secret Access Key obtained when creating the IAM user account.

Verify the configuration

aws iam get-user

  • Install Docker Desktop for Windows

Docker is needed by the AWS SAM CLI, which uses Docker to simulate a Lambda-like execution environment locally for testing and debugging the SAM application.

Download the installation file from here. You will need to register a Docker account if you do not have one yet.

Docker Desktop for Windows relies on Hyper-V, so I had to enable Hyper-V in Windows first.

However, I also use VirtualBox to maintain a legacy Windows XP application, and VirtualBox won't work while Hyper-V is enabled. The acceptable solution is to create two boot options, one with Hyper-V and one without.

That way I can reboot the PC into the no-Hyper-V Windows option and use VirtualBox whenever I need it.
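For the dual-boot setup, a second boot entry can be created with bcdedit from an elevated command prompt. This is only a sketch; the entry name is arbitrary, and {GUID} stands for the identifier that the first command prints:

bcdedit /copy {current} /d "Windows 10 No Hyper-V"
bcdedit /set {GUID} hypervisorlaunchtype off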

Also, I had to enable CPU virtualization in the BIOS in order to run Docker.

Docker is a memory hog. I had to restrict its memory usage in the Docker settings screen because my laptop does not have much memory.

After Docker is running, I added a shared drive in Docker so that containers can access my local files.

  • Install AWS SAM CLI

With Docker running, we can now install the AWS SAM CLI.

pip install --user aws-sam-cli

Verify the installation

sam --version

  • Install Visual Studio Code

We use Visual Studio Code as the code editor. Visual Studio Code can run a terminal shell inside the editor; I use the Bash shell that comes with Git for Windows.

During development, I run most of the shell commands in Git Bash.

I don't use the Windows 10 WSL (Windows Subsystem for Linux, https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux), even though that is also an option.

After Visual Studio Code is installed, I add the AWS Toolkit for Visual Studio Code. This extension has very good support for the SAM CLI and can also manage AWS resources right inside Visual Studio Code, but I don't really use that part since I prefer to stick to the AWS CLI.

Programming

In order to understand SAM, we have to understand AWS CloudFormation first. Essentially, SAM is an extension of CloudFormation, and a SAM application is eventually deployed as a CloudFormation stack in the AWS cloud.

AWS CloudFormation is one of the fundamental AWS services supporting Infrastructure as Code (IaC). It uses a template file (YAML or JSON) to describe the AWS resources needed by the application, and then takes care of provisioning and configuring those resources. The AWS CloudFormation template specification is like a programming language for AWS cloud resources. Alternatively, without AWS CloudFormation, we could use the AWS Management Console to manually create/update/delete the AWS cloud resources as needed, which is obviously not ideal, especially for a DevOps-oriented organization. Or we could write a program that provisions and configures the resources using the AWS SDK/API. But isn't that exactly what AWS CloudFormation does, and most likely does better than our specific program would?

  • Where to Start?

The AWS SAM application examples include an SQS Event Source example that is actually very close to what we want. We use that as the starting point for our SAM application.

git clone https://github.com/awslabs/serverless-application-model.git

Then copy the four files under the folder [\serverless-application-model\examples\2016-10-31\sqs] to the application folder c:\batchdataprocess\. That becomes our first commit.

This example uses the Simple Queue Service as the event source to trigger the AWS Lambda function defined in the file index.js. The file template.yaml is the SAM template file. Every time there is a new message in the queue, the Lambda function is called with the message passed in.

The AWS SAM template file is a YAML file that adheres to the open source AWS Serverless Application Model specification. We use this template file to declare all of the AWS resources that comprise the serverless application. Keep in mind that an AWS SAM template is just an extension of an AWS CloudFormation template.

The template is declared in YAML format; JSON is also supported. We took a quick YAML lesson here since we had never used it before. YAML is a superset of JSON and more powerful in terms of its capabilities. For example, a key in YAML can contain spaces.

There is also a Visual Studio Code extension for CloudFormation to help write the CloudFormation template. There is a very good document here about this extension.

Here is how the file template.yaml looks:
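The original gist is not embedded in this export, so here is a minimal sketch of what the example template contains, based on the awslabs SQS example; exact property values may differ:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Example of processing messages on an SQS queue with Lambda

Resources:
  MySQSQueueFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs8.10
      CodeUri: .
      Events:
        MySQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt MySqsQueue.Arn
            BatchSize: 10

  MySqsQueue:
    Type: AWS::SQS::Queue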

and here is how the Lambda function file index.js looks:
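A minimal sketch of the example handler; the exact logging in the repository example may differ:

exports.handler = async (event) => {
  // Each SQS message delivered by the event source mapping shows up in event.Records
  event.Records.forEach((record) => {
    console.log('Received message:', record.body);
  });
  return {};
};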

YAML is human friendly. We can understand most of the template file by just reading it through. The CloudFormation template reference and SAM template reference are always handy and helpful.

Queue: !GetAtt MySqsQueue.Arn specifies which queue is used to trigger the event. !GetAtt in YAML is called a tag. YAML uses tags to explicitly declare types. CloudFormation implements this specific tag as the short form of the intrinsic (built-in) function Fn::GetAtt. CloudFormation uses intrinsic functions in the templates to assign values to properties that are not available until runtime.

  • SAM Package and Deploy

We modify the file template.yaml to rename the Lambda function to BatchSQSQueueFunction, the event to BatchSQSEvent, and the queue to BatchSqsQueue.

With these minimal modifications, we can test SAM's package and deploy process to verify the concept before going further.

We first create an AWS S3 bucket which will be used to hold the deployment package.

aws s3 mb s3://batchdataprocess-deploy-package

We then use the command sam package to create the SAM package and save it in the S3 bucket that we just created. It also outputs the local template file batchdataprocess-deploy-template.yaml, which will be used in the next step to deploy the package. We add this file to the .gitignore file since we do not want it to be part of the source code.

sam package --template-file template.yaml --output-template-file batchdataprocess-deploy-template.yaml --s3-bucket 'batchdataprocess-deploy-package'

Command sam package is an alias for the command aws cloudformation package.

Finally we deploy the application to the CloudFormation stack batch-data-processor.

sam deploy --template-file ./batchdataprocess-deploy-template.yaml --stack-name batch-data-processor --capabilities CAPABILITY_IAM

sam package and sam deploy are two AWS SAM CLI commands. Here is the SAM CLI Reference. We will use some other commands later.

Now we know how to package and deploy the SAM application.

The source code commit is here.

  • Friendly Resource Name

Here is what it looks like if we go to the AWS Management Console and navigate to this CloudFormation stack:

The interesting part is the [Resources] section. It lists all the resources created during the deployment. Four resources were created during the last deployment: a Lambda function, an SQS queue, an IAM role, and a Lambda event source mapping (the trigger).

The queue URL is not very friendly because CloudFormation generates the physical name automatically. We will configure this URL in the MessageSyncProxy program to push the message data, so a friendlier physical resource name is preferred. We add a QueueName property to the queue in the template:
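A sketch of the queue resource with the QueueName property added; the base name matches the physical name shown later, and other properties are omitted:

BatchSqsQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: batch-sqs-queue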

You can add more properties to the Queue. But keep in mind, Resource names must be unique across all of your active stacks. If you reuse templates to create multiple stacks, you must change or remove custom names from your template.

We also added a function name and description to the lambda function in the template file to make the physical name of the function more friendly.
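A sketch of the function resource with the added name and description; the FunctionName is inferred from the log group name seen later, and the Description text is illustrative:

BatchSQSQueueFunction:
  Type: AWS::Serverless::Function
  Properties:
    FunctionName: batch-sqs-queue-process
    Description: Processes messages from the batch SQS queue
    Handler: index.handler
    Runtime: nodejs8.10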

You can add more properties to the function.

Now we can package and deploy it again. We don't need to delete the stack first before deploying again; AWS CloudFormation is smart enough to figure out what has changed and just performs an update. Here is a screenshot with the improvements:

If you want to delete the stack, use the following command

aws cloudformation delete-stack --stack-name 'batch-data-processor'

That will delete the stack and all of the related resources declared by this template.

The source code commit is here.

  • Multitenancy

This serverless application will actually be used by multiple customers/companies. For security and simplicity, we will create one AWS SQS queue per company. The adjusted design looks like this:

We need to pass the company's name to the SAM template as a parameter so that we can create the SQS queue, Lambda function, and other company-specific resources with the company name as the resource's physical name suffix. Parameters enable you to pass custom values to your template each time you create or update a stack.

Here is how the SAM template looks with the company parameter:
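A sketch of the parameterized template; the AllowedPattern value and constraint text are illustrative assumptions:

Parameters:
  CompanyParameter:
    Type: String
    AllowedPattern: '^[a-z][a-z0-9]*$'
    ConstraintDescription: Company name must start with a lowercase letter and contain only lowercase letters and digits.

Resources:
  BatchSqsQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: !Sub 'batch-sqs-queue-${CompanyParameter}'

  BatchSQSQueueFunction:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: !Sub 'batch-sqs-queue-process-${CompanyParameter}'
      # ...other function properties unchanged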

We added the parameter CompanyParameter, and we use AllowedPattern to restrict the company name format. ConstraintDescription gives a better error message if the value doesn't follow the format. The queue name uses the intrinsic function Fn::Sub to build the queue name with the company parameter value as the suffix.

The intrinsic function Fn::Sub substitutes variables in an input string with values that you specify. There is a good article about how Fn::Sub works.

CloudFormation is really the fundamental service for AWS infrastructure management. Having a solid understanding of and experience with CloudFormation is always helpful for SAM application development, especially when you write the template YAML file. I strongly recommend reading through the CloudFormation User Guide if you have time.

Now we package the application again:

aws cloudformation package --template-file template.yaml --output-template-file batchdataprocess-deploy-template.yaml --s3-bucket 'batchdataprocess-deploy-package'

and then deploy the application:

aws cloudformation deploy --template-file ./batchdataprocess-deploy-template.yaml --stack-name batch-data-processor-eagle --capabilities CAPABILITY_IAM --parameter-overrides CompanyParameter=eagle

This time we pass the company parameter’s value [eagle] to the stack. We also give the stack a new name [batch-data-processor-eagle].

After the stack is created, we can see the stack in the management console:

We can see that the queue is now named [batch-sqs-queue-eagle].

The ability to pass parameters to the template gives the SAM application great flexibility.

The source code commit is here.

  • IAM user for the queue

We need to create an IAM user with permission to access the queue. Then we can give that user's credentials to the MessageSyncProxy program, which will use them to push data to the queue. Again, we could manually create the IAM user with access to the queue and download the credentials from the AWS Management Console. But then we would have to manually create an IAM user every time a new company signs up, which is an anti-DevOps approach; automating everything is one of the goals of DevOps. Also, if a customer leaves, we need to delete all resources related to this application, and we might forget some if they were created manually and are not defined in the template file.

Here is the template with the IAM user creation for the queue:
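A sketch of the two resources; the inline policy name and the exact SQS actions are assumptions, the point being that this user can only publish to this one queue:

BatchSqsPublisher:
  Type: AWS::IAM::User
  Properties:
    UserName: !Sub 'batch-sqs-publisher-${CompanyParameter}'
    Policies:
      - PolicyName: batch-sqs-publish
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action:
                - sqs:SendMessage
                - sqs:GetQueueAttributes
              Resource: !GetAtt BatchSqsQueue.Arn

BatchSqsPublisherAccessKey:
  Type: AWS::IAM::AccessKey
  Properties:
    UserName: !Ref BatchSqsPublisher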

We declared two new resources in the template: BatchSqsPublisher and BatchSqsPublisherAccessKey.

The resource BatchSqsPublisher creates the IAM user. By providing an inline policy to the user in the template, this user only has access to the queue BatchSqsQueue. Keep security in mind at all times when programming a serverless application.

The resource BatchSqsPublisherAccessKey creates the access key ID and secret access key for this IAM user, which the MessageSyncProxy program will use to publish messages to the queue programmatically.

Here are some helpful AWS IAM Template Snippets. Here is the template reference of AWS IAM AccessKey.

The access key ID and secret access key are only visible once, when the access key is created. In order to obtain the credentials, we use the template's [Outputs] section to output the access key and secret key so that we can see them later in the AWS Management Console. The Outputs section declares output values that you can import into other stacks (to create cross-stack references), return in response to describe-stack calls, or view on the AWS CloudFormation console. For example, you can output the S3 bucket name for a stack to make the bucket easier to find.
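A sketch of the Outputs section; for AWS::IAM::AccessKey, Ref returns the access key ID and Fn::GetAtt SecretAccessKey returns the secret:

Outputs:
  BatchSqsPublisherAccessKeyId:
    Description: Access key ID for the queue publisher user
    Value: !Ref BatchSqsPublisherAccessKey
  BatchSqsPublisherSecretAccessKey:
    Description: Secret access key for the queue publisher user
    Value: !GetAtt BatchSqsPublisherAccessKey.SecretAccessKey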

Here is how it looks in the AWS Management Console:

  • IAM role for lambda

From the Management Console, we can see that CloudFormation actually created an IAM role even though we didn't declare one explicitly in the template. That is because the Lambda function needs to be executed under a role with the proper permissions; CloudFormation creates one with default permissions if we don't declare one explicitly in the template. This is called the Lambda execution role.

However, later we need to add code in the Lambda function to access the AWS Systems Manager Parameter Store, where we save some configuration data. We need to modify the role's policy to allow it to access the specific parameters created in this stack.

In order to do that, we first declare the role in the template explicitly, with the same permissions as the default one created by CloudFormation:
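A sketch of the role declaration; the RoleName is an assumption, while the two managed policy ARNs and the trust policy follow the description below:

BatchSQSQueueFunctionRole:
  Type: AWS::IAM::Role
  Properties:
    RoleName: !Sub 'batch-sqs-queue-process-role-${CompanyParameter}'
    ManagedPolicyArns:
      - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      - arn:aws:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole
    AssumeRolePolicyDocument:
      Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Principal:
            Service: lambda.amazonaws.com
          Action: sts:AssumeRole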

We assign the role BatchSQSQueueFunctionRole to the Lambda function by setting the function BatchSQSQueueFunction's Role property to !GetAtt BatchSQSQueueFunctionRole.Arn.

We declare the role BatchSQSQueueFunctionRole with a unique name and two managed policies providing the permissions required to use Lambda features:

  1. AWSLambdaBasicExecutionRole — Permission to upload logs to CloudWatch.
  2. AWSLambdaSQSQueueExecutionRole — Permission to read a message from an Amazon Simple Queue Service (Amazon SQS) queue by the event source mapping.

An AWS managed policy is a standalone policy that is created and administered by AWS. Standalone means that the policy has its own Amazon Resource Name (ARN) that includes the policy name. It is like a system-predefined policy in AWS.

We also use the AssumeRolePolicyDocument property to associate a trust policy with this role. Trust policies define which entities can assume the role; you can associate only one trust policy with a role. In this trust policy, we use the Principal element to specify that the service principal lambda.amazonaws.com can assume the role. The Principal element specifies the user, account, service, or other entity that is allowed or denied access to a resource, and a service principal is an identifier used to grant permissions to a service. The Action element allows the sts:AssumeRole action, which is an AWS Security Token Service action that returns a set of temporary security credentials that you can use to access AWS resources that you might not normally have access to.

Confusing? Simply put, the Lambda function will be executed under the role BatchSQSQueueFunctionRole, which is assumed by the service principal lambda.amazonaws.com. The service principal uses the Security Token Service's AssumeRole action to get temporary security credentials so that it can access the resources allowed in this role declaration. We could have the role assumed by an account or an IAM user, but having it assumed by a service principal makes more sense here because we don't want the Lambda function associated with any specific account or user.

We can deploy the template again to update the stack. Now the stack looks like this:

The source code commit is here.

  • Parameter Store

Before coding the Lambda function, we have one more thing to figure out. When the Lambda function gets a message from the queue, it will process the data and upload the result to the ERP system using the ERP system's RESTful API. The RESTful API uses an API key/secret pair to authenticate the API requests, and each company has its own API key/secret.

We cannot hard-code the key/secret inside the Lambda function; we need to store the credentials somewhere separately. Where can we store the key/secret? In AWS, we have a couple of options. Here and here are two good articles about how to choose between Parameter Store and Secrets Manager.

We decided to go with AWS Systems Manager Parameter Store. For security purposes, we will encrypt the API key/secret in the Parameter Store.

First, we need to create a customer managed key in AWS KMS (Key Management Service) and use it as the encryption key for the data stored in Parameter Store.

We add one more template parameter, {UsernameParameter}.

We create a customer managed key with an alias alias/batch-sqs-queue-process-encryptkey-${CompanyParameter}.

In the KeyPolicy, we give the user ${UsernameParameter} permission to create and manage the key, but not to use it to decrypt. This permission applies when we deploy the stack (aws cloudformation deploy), which runs under my user account through the AWS CLI. We then give BatchSQSQueueFunctionRole permission to decrypt; this permission is used by the Lambda function when it reads the encrypted data from the Parameter Store.

You can see this key in your aws console.

In the template, we specify the key policy that:

  1. only BatchSQSQueueFunctionRole can use the key to decrypt.
  2. the user ${UsernameParameter} can manage the key but cannot use it to decrypt. ${UsernameParameter} is a parameter passed to the template; it is our AWS account username when we deploy the application.

This enhances the security of the stored api key/secret.
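A sketch of the key and its alias in the template; the resource names and the admin action list are assumptions, the intent being that ${UsernameParameter} administers the key and only the function role may decrypt with it:

BatchSqsEncryptKey:
  Type: AWS::KMS::Key
  Properties:
    Description: !Sub 'Encryption key for the ${CompanyParameter} API credentials'
    KeyPolicy:
      Version: '2012-10-17'
      Statement:
        - Sid: AllowKeyAdministration
          Effect: Allow
          Principal:
            AWS: !Sub 'arn:aws:iam::${AWS::AccountId}:user/${UsernameParameter}'
          Action:
            - kms:Create*
            - kms:Describe*
            - kms:Enable*
            - kms:List*
            - kms:Put*
            - kms:Update*
            - kms:Disable*
            - kms:Delete*
            - kms:ScheduleKeyDeletion
            - kms:CancelKeyDeletion
          Resource: '*'
        - Sid: AllowDecryptByFunctionRole
          Effect: Allow
          Principal:
            AWS: !GetAtt BatchSQSQueueFunctionRole.Arn
          Action: kms:Decrypt
          Resource: '*'

BatchSqsEncryptKeyAlias:
  Type: AWS::KMS::Alias
  Properties:
    AliasName: !Sub 'alias/batch-sqs-queue-process-encryptkey-${CompanyParameter}'
    TargetKeyId: !Ref BatchSqsEncryptKey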

Now we need to store the API key/secret in the Parameter Store. Ideally, we would program the template to save the API key in the Parameter Store. However, at this moment AWS CloudFormation doesn't support the SecureString parameter type, so we have to add the parameter manually from the AWS Management Console for now. Here is a screenshot:

We also give the Lambda function's role access to the Parameter Store, because later the Lambda function will use the AWS SDK to read from it. Here is the template:
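A sketch of the policy attached to the function role; the policy resource name is an assumption, and the parameter ARN uses the name/path convention introduced a few paragraphs below:

BatchSqsParameterStorePolicy:
  Type: AWS::IAM::Policy
  Properties:
    PolicyName: !Sub 'batch-sqs-parameter-store-access-${CompanyParameter}'
    Roles:
      - !Ref BatchSQSQueueFunctionRole
    PolicyDocument:
      Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Action:
            - ssm:GetParameter
            - ssm:GetParameters
          Resource: !Sub 'arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:parameter/${EnvironmentParameter}/batchsqsqueueprocess/apicredentials/${CompanyParameter}'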

It ensures that the Lambda role can only access this specific parameter.

In the Lambda function, the API key/secret will be retrieved from the Parameter Store with decryption. We need to pass some environment variables to the Lambda function so it can construct the parameter name/path. Remember, each company has its own key/secret.

We add one more template parameter, {EnvironmentParameter}. Now we have three template parameters:

We pass two environment variables, ENV and COMPANY, to the Lambda function through the template file. The API key/secret parameter's name/path looks like this:

/$(ENV)/batchsqsqueueprocess/apicredentials/$(COMPANY)

Here is how the template looks:
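A sketch of the Parameters section and the function's environment variables; the EnvironmentParameter allowed values are an assumption:

Parameters:
  CompanyParameter:
    Type: String
    AllowedPattern: '^[a-z][a-z0-9]*$'
  UsernameParameter:
    Type: String
  EnvironmentParameter:
    Type: String
    AllowedValues:
      - dev
      - prod

Resources:
  BatchSQSQueueFunction:
    Type: AWS::Serverless::Function
    Properties:
      # ...other function properties unchanged
      Environment:
        Variables:
          ENV: !Ref EnvironmentParameter
          COMPANY: !Ref CompanyParameter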

Some good articles about parameter store:

  1. Sharing Secrets with AWS Lambda Using AWS Systems Manager Parameter Store
  2. Demonstrating integration between AWS Lambda and AWS SSM Parameter Store

Before we code the lambda function, we want to do a test to make sure it works as expected. We can send a message to the queue:

BATCH_SQS_QUEUE_URL=https://sqs.us-east-1.amazonaws.com/123456789012/batch-sqs-queue
aws sqs send-message --queue-url $BATCH_SQS_QUEUE_URL --message-body '{ "myMessage": "Hello SAM!" }'

Then we can retrieve the CloudWatch logs to see if it worked:

aws logs filter-log-events --log-group-name "/aws/lambda/batch-sqs-queue-process-eagle" --start-time 1556135050

The source code commit is here.

  • Lambda function

Finally we can code the Lambda function.

First we want to get the API key/secret from the Parameter Store. Here is the code to construct the Parameter Store name/path for the API key/secret:
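A minimal sketch, assuming the ENV and COMPANY environment variables defined in the template:

// Build the Parameter Store name/path from the environment variables
const parameterName = `/${process.env.ENV}/batchsqsqueueprocess/apicredentials/${process.env.COMPANY}`;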

Next we need to write the code that uses the AWS Node.js SDK to read the API key/secret from the SSM Parameter Store.

Install the aws-sdk first.

npm install aws-sdk

The AWS SDK API call can be made the async way:
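A sketch using the AWS SDK for JavaScript: calling .promise() on the request turns it into a Promise that can be awaited, and WithDecryption tells SSM to decrypt the SecureString with the KMS key:

const AWS = require('aws-sdk');
const ssm = new AWS.SSM();

async function getApiCredentials(parameterName) {
  const result = await ssm.getParameter({
    Name: parameterName,
    WithDecryption: true
  }).promise();
  return result.Parameter.Value; // the decrypted API key/secret
}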

or the callback way:
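The same call in callback style, reusing the ssm client and parameterName from the sketch above:

ssm.getParameter({ Name: parameterName, WithDecryption: true }, (err, data) => {
  if (err) {
    console.error('Failed to read parameter:', err);
    return;
  }
  console.log('Parameter value:', data.Parameter.Value);
});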

We will use the async/await feature provided by the Node.js 8.10 runtime. The async/await syntax makes our code cleaner and helps avoid callback hell.

Here is some useful information we need to know in order to code the Lambda function properly:

  1. How javascript promise works
  2. How Javascript async/await works
  3. How Nodejs event loop works
  4. Understanding the Node.js Event Loop
  5. How lambda handler works with the event loop
  6. Common Node8 mistakes in Lambda
  7. Save time and money with AWS Lambda using asynchronous programming
  8. Building Lambda Functions with Node.js
  9. Improving Performance From Your Lambda Function From the Use of Global Variables

Understanding the Node.js event loop and how the Lambda handler works with the event loop is very helpful.
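Putting the pieces together, here is a minimal sketch of the handler; getApiCredentials is the helper sketched above, and pushToErp is a hypothetical helper that calls the ERP's API:

exports.handler = async (event) => {
  const parameterName = `/${process.env.ENV}/batchsqsqueueprocess/apicredentials/${process.env.COMPANY}`;
  const credentials = await getApiCredentials(parameterName);

  // Await all work inside the handler so nothing is left pending on the
  // event loop when Lambda freezes the container after the handler returns.
  for (const record of event.Records) {
    const message = JSON.parse(record.body);
    await pushToErp(credentials, message); // hypothetical ERP upload helper
  }
};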

The code commit is here.

  • Http

With the API key/secret ready, we can call the ERP's RESTful API to get an access token, which will later be used to call other APIs to push the data to the ERP system. Node.js has a built-in http module, and there are also some popular third-party libraries that make HTTP requests easier.

Which one to use? This post explains the pros and cons of each library.

We decided to use axios, the Promise-based HTTP client for the browser and Node.js.

npm install axios

axios supports interceptors, which can be used to refresh the access token. Some sample code:
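A sketch of an axios client with interceptors; the ERP base URL, token endpoint, and response field names are assumptions, since they depend on the ERP's API:

const axios = require('axios');

function createErpClient(apiKey, apiSecret) {
  const erpClient = axios.create({ baseURL: 'https://erp.example.com/api' });
  let accessToken = null;

  async function refreshAccessToken() {
    const response = await axios.post('https://erp.example.com/api/token', {
      key: apiKey,
      secret: apiSecret
    });
    accessToken = response.data.accessToken;
  }

  // Request interceptor: attach the current access token to every outgoing request
  erpClient.interceptors.request.use((config) => {
    if (accessToken) {
      config.headers.Authorization = `Bearer ${accessToken}`;
    }
    return config;
  });

  // Response interceptor: on a 401, refresh the token once and retry the failed request
  erpClient.interceptors.response.use(
    (response) => response,
    async (error) => {
      const original = error.config;
      if (error.response && error.response.status === 401 && !original._retried) {
        original._retried = true;
        await refreshAccessToken();
        original.headers.Authorization = `Bearer ${accessToken}`;
        return erpClient(original);
      }
      return Promise.reject(error);
    }
  );

  return erpClient;
}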

  • Logging

A Lambda function comes with a CloudWatch Logs log group, and a log stream for each instance of the function. The runtime sends details about each invocation to the log stream and relays logs and other output from the function's code.

To output logs from your function code, you can use methods on the console object, or any logging library that writes to stdout or stderr. The following example logs the values of environment variables and the event object.
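A minimal example, assuming the ENV and COMPANY environment variables from the template:

exports.handler = async (event) => {
  // console methods write to the function's CloudWatch Logs stream
  console.log('ENV:', process.env.ENV);
  console.log('COMPANY:', process.env.COMPANY);
  console.log('EVENT:', JSON.stringify(event, null, 2));
  return {};
};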

The following is an example of how to search the logs using the AWS CLI.

CURRENT_TIME=$(date +%s)
LOG_TIME=$(($CURRENT_TIME - 60))
aws logs filter-log-events --log-group-name "/aws/lambda/batch-sqs-queue-process-eagle" --start-time $LOG_TIME

Debugging

One of the advantages of using SAM to build a serverless application is being able to debug the code locally. The SAM CLI provides a Lambda-like execution environment locally; it helps you catch issues upfront by providing parity with the actual Lambda execution environment.

We will use Visual Studio Code to debug our serverless application locally. Visual Studio Code supports debugging of many languages and platforms via debuggers that are either built in or contributed by extensions.

  • package.json

We use sam build to build our Lambda source code and generate deployment artifacts that target Lambda's execution environment. By doing this, the functions we build locally run in an environment similar to the one in the AWS Cloud.

In order to build, we need to create the package.json file first. Run npm init under the root directory of the SAM application; it creates package.json with defaults. We modify the file as follows:
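A sketch of the modified package.json; the version numbers and script bodies are assumptions, the idea being to map npm scripts to the SAM CLI commands used later:

{
  "name": "batchdataprocess",
  "version": "1.0.0",
  "description": "Processes batch SQS queue messages and pushes them to the ERP system",
  "main": "index.js",
  "scripts": {
    "build": "sam build",
    "start": "sam local invoke --env-vars env.json --debug-port 5858"
  },
  "dependencies": {
    "aws-sdk": "^2.400.0",
    "axios": "^0.18.0"
  }
}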

  • launch.json

To set up Microsoft Visual Studio Code for step-through debugging Node.js functions with the AWS SAM CLI, use the following launch configuration. Before you do this, set the directory where the template.yaml file is located as the workspace root in Microsoft Visual Studio Code:
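The launch configuration below follows the AWS SAM documentation for Node.js; the debug port matches the one passed to sam local invoke later:

{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Attach to SAM CLI",
      "type": "node",
      "request": "attach",
      "address": "localhost",
      "port": 5858,
      "localRoot": "${workspaceRoot}",
      "remoteRoot": "/var/task",
      "protocol": "inspector",
      "stopOnEntry": false
    }
  ]
}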

The launch.json file basically tells Visual Studio Code how to attach to the debugger.

  • env.json

Remember that we need to pass the environment variables ENV and COMPANY to the Lambda function? In production, they are defined in the template file. When we debug locally, we need to pass them to the SAM CLI sam local invoke via the --env-vars parameter, so we first create the file env.json as follows:
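A sketch of env.json; sam local invoke expects the variables keyed by the function's logical ID, and the values here are illustrative:

{
  "BatchSQSQueueFunction": {
    "ENV": "dev",
    "COMPANY": "eagle"
  }
}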

Later we will pass this file to the SAM CLI sam local invoke.

  • event.json

In production, the Lambda function is invoked by the trigger of a new queue message. This won't work in our local environment. In order to debug the Lambda, we can manually create some queue messages in a JSON file and then pass this file to the SAM CLI sam local invoke via the -e parameter. The SAM CLI will pass the messages in event.json to the Lambda function when it is invoked.

A sample event.json file:
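A sketch of an SQS event payload; the shape matches what the SQS event source delivers, and the IDs and ARN are placeholders:

{
  "Records": [
    {
      "messageId": "19dd0b57-b21e-4ac1-bd88-01bbb068cb78",
      "receiptHandle": "MessageReceiptHandle",
      "body": "{ \"myMessage\": \"Hello SAM!\" }",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1523232000000",
        "SenderId": "123456789012",
        "ApproximateFirstReceiveTimestamp": "1523232000001"
      },
      "messageAttributes": {},
      "md5OfBody": "7b270e59b47ff90a553787216d55d91d",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:us-east-1:123456789012:batch-sqs-queue-eagle",
      "awsRegion": "us-east-1"
    }
  ]
}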

  • Debug

Now we are ready to debug our sam application locally.

First, run sam build to build the package:

sam build -m package.json --parameter-overrides 'ParameterKey=CompanyParameter,ParameterValue=eagle ParameterKey=UsernameParameter,ParameterValue=liuhongbo  ParameterKey=EnvironmentParameter,ParameterValue=prod'

or

npm run build

It will build the application package under the folder .aws-sam/; remember to add this folder to the .gitignore file.

Next, run sam local invoke to run it in Docker:

sam local invoke --event event.json  --env-vars env.json --debug-port 5858

or

npm start -- -e event.json

This CLI command mounts the package in a Docker container based on the lambci/lambda image, with the debug port set to 5858.

A screenshot looks like this:

Finally, in Visual Studio Code, set a breakpoint in the source code where you want to stop, go to the Debug view, select the 'Attach to SAM CLI' configuration, and press F5 or click the green play button. Visual Studio Code will break at the breakpoint, and from there it gives you full debugging capabilities.

The code commit is here.

Afterthoughts

AWS SAM is really an extension of CloudFormation, the Infrastructure as Code service. AWS CloudFormation provides a common language (YAML or JSON) for you to describe and provision all the infrastructure resources in your cloud environment.

For Infrastructure as Code and resource provisioning, there are some other, higher-level approaches that let us use a familiar programming language:

  1. The AWS Cloud Development Kit (AWS CDK) is an open source software development framework to model and provision your cloud application resources using familiar programming languages.
  2. The troposphere library allows for easier creation of the AWS CloudFormation JSON by writing Python code to describe the AWS resources.
  3. stacker is a tool and library used to create & update multiple CloudFormation stacks.
  4. Pulumi — Create, deploy, and manage infrastructure on any cloud using your favorite language

However, having hands-on experience with SAM will definitely help us understand how these tools work.
