AWS SAM for a serverless Java application

13 min read · Aug 4, 2021

Introduction
What is AWS SAM
Serverless application example
Prerequisites
Java application — version 1
Java application — version 1: build and deploy with SAM
Java application — version 1: why it is bad
Java application — version 2
Java application — version 2: implementation
Java application — version 2: CI/CD with the AWS SAM Pipelines
Conclusion

Introduction

In my recent post about AWS Proton, I mentioned that AWS SAM is better suited to simple serverless applications. In this post, I would like to use the same example but define the infrastructure as code with the AWS SAM framework and implement the Lambda functions in Java.

What is AWS SAM

Link to the official AWS Documentation: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html

The AWS Serverless Application Model (AWS SAM) is a framework for building serverless applications on AWS. The entry point of an AWS SAM application is the template.yaml file, which is usually placed in the source root. In fact, it is a CloudFormation template with the “Transform: AWS::Serverless-2016-10-31” directive at the top of the file, which tells CloudFormation that it is a SAM template. Compared to pure CloudFormation, a SAM template allows a simpler definition of several serverless resources. As of today (August 2021), the SAM specification allows the following resources to be defined in a SAM template:

API Gateway
Application from the AWS serverless application repository
Lambda
Lambda layer
HTTP API
DynamoDB table
State machine for Step Functions

To get the full specification for the definition of these resources and their properties, see https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-specification-resources-and-properties.html.

A SAM template can contain any other AWS resources written in a CloudFormation syntax.

Serverless application example

Let’s assume we have a typical small serverless application, as shown in the following diagram. It is a CRUD API with Lambda handlers, which we will implement in Java.

The data model stored in DynamoDB is a book object with the attributes given below:

isbn (String): primary key
author (String): book author
name (String): book name

The final version of the code is available in the GitHub repository: https://github.com/rimironenko/aws-serverless-sam-app

Prerequisites

Java 8
Maven
AWS CLI is installed and configured
AWS SAM CLI is installed (see https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html)

Java application — version 1

As the architecture is very simple, the way to go is obvious and straightforward: create a Java application built with Maven, write the code for the Lambda functions, and define the infrastructure in the template.yaml file in the project root.

As AWS provides a Maven archetype for that, it makes sense to use it to generate the Maven project: https://aws.amazon.com/ru/blogs/developer/bootstrapping-a-java-lambda-application-with-minimal-aws-java-sdk-startup-time-using-maven/. Since the Lambda functions will interact with DynamoDB, we should set the service argument to “dynamodb”. As a result, the complete skeleton of the Java application will be generated as planned.

Now, let’s define our DynamoDB table:

BooksTable:
  Type: AWS::Serverless::SimpleTable
  Properties:
    PrimaryKey:
      Name: isbn
      Type: String
    ProvisionedThroughput:
      ReadCapacityUnits: 1
      WriteCapacityUnits: 1

And API Gateway:

BooksApi:
  Type: AWS::Serverless::Api
  Properties:
    StageName: stage
    Variables:
      LAMBDA_ALIAS: stage

As the stage name is a required property, it makes sense to introduce a parameter for it, to be specified before stack creation instead of being hardcoded, and to restrict the possible names by specifying the allowed values:

Parameters:
  Stage:
    Type: String
    Description: Stage name to deploy resources to
    AllowedValues:
      - dev
      - stage
      - production

What the SAM template looks like at this stage:

Now let’s create a POJO class for our data model. To simplify work with DynamoDB, let’s use the AWS SDK for Java version 2, its enhanced DynamoDB client library, and the Gson library for serialization and deserialization.

package com.home.sam.test;

import com.google.gson.annotations.SerializedName;
import software.amazon.awssdk.enhanced.dynamodb.mapper.annotations.DynamoDbBean;
import software.amazon.awssdk.enhanced.dynamodb.mapper.annotations.DynamoDbPartitionKey;

import java.util.Objects;

@DynamoDbBean
public class Book {

    public static final String PARTITION_KEY = "isbn";

    @SerializedName(PARTITION_KEY)
    private String isbn;

    @SerializedName("author")
    private String author;

    @SerializedName("name")
    private String name;

    @DynamoDbPartitionKey
    public String getIsbn() {
        return isbn;
    }

    public void setIsbn(String isbn) {
        this.isbn = isbn;
    }

    public String getAuthor() {
        return author;
    }

    public void setAuthor(String author) {
        this.author = author;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Book book = (Book) o;
        return Objects.equals(isbn, book.isbn);
    }

    @Override
    public int hashCode() {
        return Objects.hash(isbn);
    }

}

DependencyFactory helper class:

package com.home.sam.test;

import software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider;
import software.amazon.awssdk.core.SdkSystemSetting;
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbEnhancedClient;
import software.amazon.awssdk.http.urlconnection.UrlConnectionHttpClient;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;

/**
 * The module containing all dependencies required by the application.
 */
public class DependencyFactory {

    private DependencyFactory() {}

    /**
     * @return an instance of DynamoDbEnhancedClient backed by a DynamoDbClient
     */
    public static DynamoDbEnhancedClient dynamoDbEnhancedClient() {
        return DynamoDbEnhancedClient.builder()
                .dynamoDbClient(DynamoDbClient.builder()
                        .credentialsProvider(EnvironmentVariableCredentialsProvider.create())
                        .region(Region.of(System.getenv(SdkSystemSetting.AWS_REGION.environmentVariable())))
                        .httpClientBuilder(UrlConnectionHttpClient.builder())
                        .build())
                .build();
    }
}

Function code:

package com.home.sam.test;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import com.google.gson.Gson;
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbEnhancedClient;
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbTable;
import software.amazon.awssdk.enhanced.dynamodb.Key;
import software.amazon.awssdk.enhanced.dynamodb.TableSchema;

import java.util.Collections;
import java.util.Map;

public class GetItemFunction implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    public static final String ENV_TABLE_NAME = "TABLE";

    private final DynamoDbEnhancedClient dbClient;
    private final String tableName;
    private final TableSchema<Book> bookTableSchema;

    public GetItemFunction() {
        dbClient = DependencyFactory.dynamoDbEnhancedClient();
        tableName = System.getenv(ENV_TABLE_NAME);
        bookTableSchema = TableSchema.fromBean(Book.class);
    }

    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent input, Context context) {
        String response = "";
        DynamoDbTable<Book> booksTable = dbClient.table(tableName, bookTableSchema);
        Map<String, String> pathParameters = input.getPathParameters();
        if (pathParameters != null) {
            String itemPartitionKey = pathParameters.get(Book.PARTITION_KEY);
            Book item = booksTable.getItem(Key.builder().partitionValue(itemPartitionKey).build());
            if (item != null) {
                response = new Gson().toJson(item);
            }
        }

        return new APIGatewayProxyResponseEvent().withStatusCode(200)
                .withIsBase64Encoded(Boolean.FALSE)
                .withHeaders(Collections.emptyMap())
                .withBody(response);
    }
}

Two helper libraries are used here: Gson for JSON serialization and Amazon’s aws-lambda-java-events for the API Gateway request and response types.
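
Note that the handler above answers with status 200 and an empty body when no book matches the requested ISBN. If the API should signal a missing item explicitly, the lookup result can be mapped to a status code first. The following is only a minimal sketch of that idea, not the code from the repository: BookLookup and LookupResult are hypothetical names, and a plain Map stands in for the DynamoDB table so that the example runs without any AWS dependencies.

```java
import java.util.Collections;
import java.util.Map;

public class BookLookup {

    /** Hypothetical value object: HTTP status code plus response body. */
    public static final class LookupResult {
        public final int statusCode;
        public final String body;

        LookupResult(int statusCode, String body) {
            this.statusCode = statusCode;
            this.body = body;
        }
    }

    /**
     * Maps a lookup to a response: 400 when the key is missing from the request,
     * 404 when nothing is stored under the key, 200 with the item otherwise.
     * The map is an in-memory stand-in for the DynamoDB table (assumption).
     */
    public static LookupResult lookup(Map<String, String> table, String isbn) {
        if (isbn == null || isbn.isEmpty()) {
            return new LookupResult(400, "");
        }
        String item = table.get(isbn);
        if (item == null) {
            return new LookupResult(404, "");
        }
        return new LookupResult(200, item);
    }

    public static void main(String[] args) {
        Map<String, String> table = Collections.singletonMap("123", "{\"isbn\":\"123\"}");
        System.out.println(lookup(table, "123").statusCode);
        System.out.println(lookup(table, "999").statusCode);
    }
}
```

In a real handler the same decision would wrap the booksTable.getItem call, and the status code would be passed to withStatusCode.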

Definition of this function in the SAM template is given below:

GetItemFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: java8
    Handler: com.home.sam.test.GetItemFunction::handleRequest
    Timeout: 20
    MemorySize: 512
    CodeUri: .
    AutoPublishAlias: !Ref Stage
    Environment:
      Variables:
        TABLE: !Ref BooksTable
    Policies:
      - DynamoDBReadPolicy:
          TableName: !Ref BooksTable
    Events:
      ApiEvent:
        Type: Api
        Properties:
          Path: /books/{isbn}
          Method: get
          RestApiId:
            Ref: BooksApi

Highlights of the definition:

The CodeUri property set to a dot tells SAM to build the function code from the project sources rather than with plain Maven commands (use sam build instead of mvn package; otherwise, we would have to specify the code URI as ./target/sam-test-project.jar)
The table name is exposed to the function as an environment variable (a typical pattern for Lambda)
For the Lambda policy, one of the existing AWS SAM policy templates is used. See the full list at https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-policy-templates.html#serverless-policy-template-table and reuse these policies when possible
The API Gateway resource and method are defined at the Lambda resource level but attached to the API Gateway defined earlier in the stack
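
Since the table name reaches the function only through the environment, a missing or empty variable would otherwise surface late, at the first DynamoDB call. As a hedged illustration (TableNameResolver is a hypothetical helper, not part of the repository), the lookup can fail fast during initialization. The environment is passed in as a Map so that the logic can be exercised without setting real variables.

```java
import java.util.Map;

public class TableNameResolver {

    /** Same variable name as in the SAM template above. */
    public static final String ENV_TABLE_NAME = "TABLE";

    /** Returns the table name from the given environment or fails fast. */
    public static String resolve(Map<String, String> env) {
        String tableName = env.get(ENV_TABLE_NAME);
        if (tableName == null || tableName.trim().isEmpty()) {
            throw new IllegalStateException(
                    "Environment variable " + ENV_TABLE_NAME + " is not set");
        }
        return tableName;
    }

    public static void main(String[] args) {
        // Inside a Lambda function, System.getenv() holds the variables from the template.
        try {
            System.out.println(resolve(System.getenv()));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A function constructor could call TableNameResolver.resolve(System.getenv()) instead of reading the variable directly.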

Java application — version 1: build and deploy with SAM

To run the build, let’s execute the command:

sam build

SAM automatically selects the build tool depending on the function runtime (Maven in our case) and places the content to be packaged and uploaded in the .aws-sam folder, at the same level as the SAM template itself.

The content for the Lambda function code contains the compiled Java .class files and the required libraries under lib/.

To deploy the application to AWS, run the command:

sam deploy --guided

The first time, you will be prompted to provide the configuration for the CloudFormation stack created by SAM. This configuration can be saved in a file (samconfig.toml by default) that will be reused for stack updates after code changes.

Once the creation or update of the stack is completed, it can be observed in the AWS console.

Let’s test the API now. Firstly, insert a test item into the DynamoDB table.

Secondly, let’s go to the API gateway and test the get method defined for the Lambda function.

Everything works fine!

Java application — version 1: why it is bad

But what will happen if we add another Lambda function for the putItem operation, similar to the existing one?

Function code:

package com.home.sam.test;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import com.google.gson.Gson;
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbEnhancedClient;
import software.amazon.awssdk.enhanced.dynamodb.DynamoDbTable;
import software.amazon.awssdk.enhanced.dynamodb.TableSchema;

import java.util.Collections;

public class PutItemFunction implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    private static final int STATUS_CODE_NO_CONTENT = 204;
    private static final int STATUS_CODE_CREATED = 201;
    private final DynamoDbEnhancedClient dbClient;
    private final String tableName;
    private final TableSchema<Book> bookTableSchema;

    public PutItemFunction() {
        dbClient = DependencyFactory.dynamoDbEnhancedClient();
        tableName = System.getenv(GetItemFunction.ENV_TABLE_NAME);
        bookTableSchema = TableSchema.fromBean(Book.class);
    }

    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request, Context context) {
        String body = request.getBody();
        int statusCode = STATUS_CODE_NO_CONTENT;
        if (body != null) {
            Book item = new Gson().fromJson(body, Book.class);
            if (item != null) {
                DynamoDbTable<Book> booksTable = dbClient.table(tableName, bookTableSchema);
                booksTable.putItem(item);
                statusCode = STATUS_CODE_CREATED;
            }
        }
        return new APIGatewayProxyResponseEvent().withStatusCode(statusCode)
                .withIsBase64Encoded(Boolean.FALSE)
                .withHeaders(Collections.emptyMap());
    }
}

Definition of this function in the SAM template:

PutItemFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: java8
    Handler: com.home.sam.test.PutItemFunction::handleRequest
    Timeout: 20
    MemorySize: 512
    CodeUri: .
    AutoPublishAlias: !Ref Stage
    Environment:
      Variables:
        TABLE: !Ref BooksTable
    Policies:
      - DynamoDBWritePolicy:
          TableName: !Ref BooksTable
    Events:
      ApiEvent:
        Type: Api
        Properties:
          Path: /books
          Method: post
          RestApiId:
            Ref: BooksApi

Now we are ready to update the application in AWS via the sam build and sam deploy commands. Once it is deployed, we can test the newly added API method: add a new book via the just added POST /books API and check that this book is returned by the GET /books/{isbn} API method.

But why is it a bad approach?

Let’s look at the function content built by SAM.

We can see that the function code is fully duplicated. Moreover, both functions contain the dependencies in the lib/ folder. And let’s take a look at the size of the functions in the AWS console.
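
The same comparison can also be made locally before deploying, by summing the file sizes of each function’s build output. The snippet below is a small sketch under assumptions: BuildSize is a hypothetical helper, and the default path assumes that sam build placed the output for the function under .aws-sam/build/GetItemFunction.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class BuildSize {

    /** Sums the sizes of all regular files under the given directory. */
    public static long sizeInBytes(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                    .mapToLong(BuildSize::fileSize)
                    .sum();
        }
    }

    /** Wraps the checked exception so the method can be used in a stream. */
    private static long fileSize(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Default path is an assumption based on the function name used in this post.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".aws-sam/build/GetItemFunction");
        if (Files.isDirectory(dir)) {
            System.out.println(dir + ": " + sizeInBytes(dir) + " bytes");
        } else {
            System.out.println("Directory not found: " + dir);
        }
    }
}
```

Running it once per function directory shows whether the lib/ folder dominates the package size. The number is the unpacked size, so it will differ from the zipped size reported in the console.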

So we already know that both functions contain the same code, including the dependencies. Moreover, if we define an inline function for the getItem operation in Python and deploy it, we will see that the Java function is much larger than the Python one.

GetItemFunctionPython:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: python3.8
    Handler: index.handler
    Timeout: 10
    MemorySize: 128
    AutoPublishAlias: !Ref Stage
    Environment:
      Variables:
        TABLE: !Ref BooksTable
    Policies:
      - DynamoDBReadPolicy:
          TableName: !Ref BooksTable
    InlineCode: |
      import json
      import os

      import boto3
      from botocore.exceptions import ClientError

      dynamodb = boto3.resource('dynamodb')


      def handler(event, context):
          table = dynamodb.Table(os.environ['TABLE'])
          response = ''
          book_key = event['pathParameters']['isbn']
          if book_key:
              try:
                  book = table.get_item(Key={'isbn': book_key})
              except ClientError as e:
                  print(e.response['Error']['Message'])
              else:
                  # 'Item' is absent from the result when no book matches the key
                  item = book.get('Item')
                  if item:
                      response = json.dumps(item)
          return {
              'statusCode': 200,
              'body': response
          }
    Events:
      ApiEvent:
        Type: Api
        Properties:
          Path: /books/python/{isbn}
          Method: get
          RestApiId:
            Ref: BooksApi

Therefore, the approach for the Java application should be changed to optimize the code of the functions.

Java application — version 2

To decrease the size of the functions and avoid code duplications, we need to do the following:

1. Package the application dependencies into a Lambda Layer

A Lambda Layer is an AWS resource for Lambda functions that is used to decouple the business logic code from all the supporting resources (libraries, configuration files, custom Lambda runtimes, etc.). Please see https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-concepts.html#gettingstarted-concepts-layer

2. Move Lambda functions to separate applications to build and package them independently.

The application will now look like the picture below, and it is not a Java monolithic application anymore.

Java application — version 2: implementation

Full code is available in my GitHub repository: https://github.com/rimironenko/aws-serverless-sam-app

Modules structure:

Implementation highlights:

the lambda-layer module just contains all the required dependencies for Lambda functions
the item-service-core module contains common Java classes for all Lambda functions and is added to the lambda-layer module as a Maven dependency
the service modules include the lambda-layer module as a dependency with scope=provided
the sam build command by default does not skip dependencies with scope=provided; therefore, the build for the functions is customized as described at: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/building-layers.html. A “Metadata” section was added to the resource definition; it refers to the file with the custom build commands.

Metadata:
  BuildMethod: makefile

The “makefile” file contains custom Maven commands for the correct build of the application. The “makefile” file content for the getItem function is given below:

build-GetItemFunction:
	mvn clean package
	mvn dependency:copy-dependencies -DexcludeScope=provided
	cp -rf ./target/classes/* $(ARTIFACTS_DIR)/

With this approach, each Lambda function is now smaller than 2 KB. This is a significant reduction compared to about 6 MB per function in the previous version of the application.
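
To see how the saving grows with the number of functions, the totals can be estimated with simple arithmetic. The sketch below uses the two figures from this post (about 6 MB per function before, at most 2 KB after); the layer size is an assumption, taken to be roughly what a single version 1 package held, and PackageSizeEstimate is a hypothetical name.

```java
public class PackageSizeEstimate {

    // About 6 MB per function in version 1 (figure from the post).
    static final long FUNCTION_V1_BYTES = 6L * 1024 * 1024;
    // Upper bound of 2 KB per function in version 2 (figure from the post).
    static final long FUNCTION_V2_BYTES = 2L * 1024;
    // Assumption: the shared layer holds roughly what one version 1 package held.
    static final long LAYER_BYTES = 6L * 1024 * 1024;

    /** Version 1: every function carries its own copy of the dependencies. */
    public static long totalV1(int functions) {
        return functions * FUNCTION_V1_BYTES;
    }

    /** Version 2: the dependencies are stored once, in the layer. */
    public static long totalV2(int functions) {
        return LAYER_BYTES + functions * FUNCTION_V2_BYTES;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 5, 10}) {
            System.out.println(n + " functions: v1=" + totalV1(n)
                    + " bytes, v2=" + totalV2(n) + " bytes");
        }
    }
}
```

Under these assumptions the version 1 total grows by about 6 MB with every added function, while the version 2 total stays close to the size of the layer.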

All the dependencies went to the Lambda Layer:

Please note that the Lambda Layer module and the Core module should be installed to the local Maven repository, so Maven will be able to find these libraries during the build. The sequence of the build commands is given below:

- cd item-service-core/
- mvn install
- cd ../lambda-layer/
- mvn install
- cd ..
- sam build 
- sam deploy --guided

Java application — version 2: CI/CD with the AWS SAM Pipelines

Just two weeks ago (on July 21, 2021), AWS introduced AWS SAM Pipelines. This new capability makes it easier to create CI/CD pipelines for a SAM application. A link to the AWS Compute Blog post with a detailed example is given below:

Introducing AWS SAM Pipelines: Automatically generate deployment pipelines for serverless…

Let’s use the SAM Pipelines feature and generate a pipeline for the application.

1. In the application root, run the command given below:

sam pipeline init --bootstrap

2. Enter “1” to choose AWS Quick Start Pipeline Templates

3. Enter “4” to choose AWS CodePipeline

AWS SAM reports that no stages were detected and suggests setting them up. Type “Y” to initiate the creation of the stages.

4. Enter “dev” for the stage name. Enter “2” to use the named profile.

5. Wait until SAM creates the resources for this stage.

6. Enter “Y” to continue to build the next pipeline stage resources.

7. Enter “prod” for the name of the second stage. Enter “2” to use the named profile.

8. Wait until SAM creates the resources for this stage.

9. Configure the source provider: enter the number of your provider from the list, enter information about the repository and the main branch, and enter “dev” and “prod” as the names of the pipeline stages.

10. Wait until SAM creates the pipeline files:

the codepipeline.yaml file: the CloudFormation template with the definition of the pipeline resources. It contains several comments explaining how to use optional features of the generated pipeline
the files in the pipeline folder: build specifications in the AWS CodeBuild specification format
the assume-role.sh script: the script to be used in the AWS CLI build commands

All the files above should be added to version control to make the pipeline work.

11. Fix the build specifications:

Add the “corretto8” Java runtime, because our application uses Java, as in the picture given below.

Specify the commands given below to build the application correctly:

- cd item-service-core/
- mvn install
- cd ../lambda-layer/
- mvn install
- cd ..
- sam build --template ${SAM_TEMPLATE}
- . ./assume-role.sh ${TESTING_PIPELINE_EXECUTION_ROLE} test-package
- sam package --s3-bucket ${TESTING_ARTIFACT_BUCKET}
              --region ${TESTING_REGION}
              --output-template-file packaged-test.yaml
- . ./assume-role.sh ${PROD_PIPELINE_EXECUTION_ROLE} prod-package
- sam package --s3-bucket ${PROD_ARTIFACT_BUCKET}
              --region ${PROD_REGION}
              --output-template-file packaged-prod.yaml

Please note that we call mvn install for the custom dependencies so that Maven can find them in the local Maven repository and resolve all the dependencies.

12. To establish a webhook between the source repository and the pipeline, run the following command with your own name for the pipeline stack:

sam deploy -t codepipeline.yaml --stack-name sam-app-pipeline --capabilities=CAPABILITY_IAM

Conclusion

Advantages of AWS SAM

Suits best for small serverless applications
Has several built-in templates that simplify the creation of a SAM application
Full control over the developers’ code, infrastructure code, and CI/CD flow
The syntax of a SAM template is easy to understand even for people who do not know AWS CloudFormation
The recently introduced AWS SAM Pipelines feature allows generating a CI/CD pipeline easily and quickly
SAM template supports the CloudFormation syntax to describe AWS resources, therefore you can use AWS SAM even if your application is not 100% serverless

Disadvantages of AWS SAM

Developers’ code is coupled to the infrastructure code, which brings the risk of accidentally breaking the infrastructure through a developer error
Medium and large serverless applications may require configuration overhead that makes AWS SAM not the best choice. During software design, tools and architectures other than SAM should be considered
AWS SAM Pipelines currently does not detect whether the application code has changed since the latest deployment and updates the code of the Lambda functions in any case. Therefore, if your Lambda function has a long cold start, you should take care of the deployment strategy and use linear or canary deployment. Please read https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/automating-updates-to-serverless-apps.html to learn more about the AWS SAM capabilities for configuring it.

Constraints of AWS SAM

It is designed for serverless applications; therefore, it is not an option if your application is not a serverless one
It is designed around SAM CLI commands for the phases of the DevOps pipeline (“DevOps eight”). Therefore, if you rely on other tooling for those phases, AWS SAM may not be the best choice for your application