AWS Lambda + Terraform + GitHub Actions Deployment

Wambui Gitau · 6 min read · Mar 7, 2023

AWS Lambda is an AWS component that enables you to run code without worrying about the underlying infrastructure (serverless computing). It scales up and down automatically with demand, and it is cost-effective for tasks that run in under 15 minutes (the maximum execution time), since you do not pay for idle instances.

AWS Lambda has two main concepts: event sources (triggers from an event, such as a change of a resource or data in S3, DynamoDB, or IoT) and functions (pieces of code that run when an event occurs; supported runtimes include Node.js, Python, Java, Go, Ruby, and .NET/C#).

In data engineering, Lambda is mostly used for extraction or simple transformation of data. For extraction, one can consume APIs, IoT devices, or webhooks, to name a few. Simple transformations include creating subsets of a raw dataset, renaming columns, or changing data types. AWS Lambda has limited memory (configurable from 128 MB up to 10,240 MB), so for complex transformations one should use other resources such as Spark.
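
As a toy illustration of the kind of simple transformation that fits in a Lambda handler (a hypothetical example, not part of this post's project), renaming columns in a batch of records needs nothing beyond the Python standard library:

def transform_handler(event, context):
    # Hypothetical: rename keys on a list of records pulled from an API.
    column_map = {"fname": "first_name", "lname": "last_name"}
    records = event.get("records", [])
    renamed = [
        {column_map.get(key, key): value for key, value in record.items()}
        for record in records
    ]
    return {"records": renamed, "Status": 200}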

In the remainder of this post, we will look at AWS Lambda Functions. We will learn the following:

  • Working with LocalStack
  • Resources that work with AWS Lambda
  • Creating AWS Lambda resources using Terraform
  • Creating a Lambda function whose code is stored on S3
  • Deploying the Lambda function code using a GitHub Actions workflow

Prerequisites

  • Terraform installed
  • Python 3.7+ installed
  • Docker installed
  • Terraform Basics
  • AWS basics
  • AWS CLI

LocalStack

LocalStack is a fully functional local cloud stack that emulates AWS services. It is used when you want to build your infrastructure without incurring cloud costs. For this post, we will use LocalStack in place of AWS.

  • Installing LocalStack
brew install localstack
localstack --help
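
If you are not on macOS, LocalStack can also be installed through pip:

pip install localstack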

AWS + Terraform

The primary resources that we will be using are S3 and Lambda functions; the secondary resource will be IAM roles. The code will be organised as a module. The folder structure will look like the one below:

(image: folder structure)
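
The original image is omitted here; based on the files created throughout this post, the layout is roughly as follows (the exact file names inside the module are assumptions):

.
├── provider.tf
├── main.tf
└── lambda/
    ├── iam.tf
    ├── lambda.tf
    ├── s3.tf
    ├── user.tf
    └── lambda_function/
        └── hello_world.py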

I found it easier to configure Terraform through the provider's endpoint settings instead of using tflocal, because I came across the error below when creating the S3 bucket:

dial tcp: lookup s3.localhost.localstack.cloud on 192.168.178.1:53: no such host

The provider file will look like the one below:

provider "aws" {
access_key = "test"
secret_key = "test"
region = "us-east-1"

# only required for non virtual hosted-style endpoint use case.
# https://registry.terraform.io/providers/hashicorp/aws/latest/docs#s3_force_path_style
s3_use_path_style = true
skip_credentials_validation = true
skip_metadata_api_check = true
skip_requesting_account_id = true

endpoints {
s3 = "http://localhost:4566"
lambda = "http://localhost:4566"
iam = "http://localhost:4566"
}
}

You will also need to run aws configure and set up dummy values.
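
LocalStack does not validate credentials, so any dummy values will do. One non-interactive way to set them:

aws configure set aws_access_key_id test
aws configure set aws_secret_access_key test
aws configure set region us-east-1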

The main file will be used to initialise the module.

module "lambda" {
source = "./lambda"
}

For S3, we will create a bucket and a bucket object. The object will be created inside the lambda file. The bucket will store the code. Below is the S3 file:

resource "aws_s3_bucket" "code_bucket" {
bucket = var.code_bucket_name
}

resource "aws_s3_bucket_server_side_encryption_configuration" "code" {
bucket = aws_s3_bucket.code_bucket.bucket

rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}
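
The bucket name comes from a variable that the original post does not show. A minimal declaration could look like the one below (the test-bucket default is an assumption that matches the CLI commands used later):

variable "code_bucket_name" {
  description = "Name of the S3 bucket that stores the Lambda code"
  type        = string
  default     = "test-bucket"
}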

The Lambda file will consist of the Lambda function, the S3 bucket object, and the dummy archive file, which is necessary at the beginning. If you have layers, this is where they will be placed.

The S3 bucket object states where the zipped lambda function will be stored. The archive file will zip the lambda function folder.

resource "aws_s3_object" "lambda_function" {
bucket = aws_s3_bucket.code_bucket.id
key = "localstack_test/lambda_function.zip"
source = data.archive_file.lambda_function.output_path
etag = filemd5(data.archive_file.lambda_function.output_path)

lifecycle {
ignore_changes = all
}
}


data "archive_file" "lambda_function" {
type = "zip"
source_dir = "${path.module}/lambda_function"
output_path = "${path.module}/lambda_function.zip"
}

resource "aws_lambda_function" "hello_world_function" {
function_name = "HelloWorldLambda"
handler = "hello_world.main_handler"
runtime = "python3.7"
s3_bucket = aws_s3_bucket.code_bucket.bucket
s3_key = aws_s3_object.lambda_function.key
environment {
variables = {
test_var = "hello"
}
}
}

For the roles, we will have only one role, carrying the assume-role policy and the AWSLambdaBasicExecutionRole managed policy. We will then tie the role to the Lambda function by adding the role argument to the function resource:

role          = aws_iam_role.lambda_execution_role.arn

Below is the code in iam.tf:

data "aws_iam_policy" "AWSLambdaBasicExecutionRole" {
arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

resource "aws_iam_role_policy_attachment" "AWSLambdaBasicExecutionRole" {
role = aws_iam_role.lambda_execution_role.name
policy_arn = data.aws_iam_policy.AWSLambdaBasicExecutionRole.arn
}

resource "aws_iam_role" "lambda_execution_role" {
name = "helloworld-api-execution-role"


assume_role_policy = jsonencode(
{
Version = "2012-10-17",
Statement = [{
Action = "sts:AssumeRole",
Principal = {
Service = "lambda.amazonaws.com"
},
Effect = "Allow"
}]
}
)
}

Create a lambda_function folder that will hold the dummy Lambda function file hello_world.py. The dummy code will be as below:

import os

def main_handler(event, context):
    test_var = os.environ["test_var"]
    return {"message": test_var, "Status": 200}

When that is done, run the terraform commands to create the resources. Open a terminal and paste the command below:

localstack start

Open a new terminal to run terraform. Below are the commands:

terraform init
terraform plan
terraform apply

When the commands are completed, you can test whether the resources have been created by running:

aws --endpoint-url=http://localhost:4566 s3 ls s3://test-bucket

aws --endpoint-url=http://localhost:4566 lambda list-functions
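
You can also invoke the function against LocalStack to confirm it returns the environment variable (the output file name here is arbitrary):

aws --endpoint-url=http://localhost:4566 lambda invoke \
  --function-name HelloWorldLambda output.json
cat output.json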

Deployment

By default, Terraform is set up to deploy Lambda functions. Ideally, though, IaC and application code live in separate repos, so we need a way to deploy the Lambda function code without going through Terraform. To do that, we will use GitHub Actions.

The following are the steps taken to deploy the Lambda function on AWS:

  • Create a user who will be granted the necessary permissions. Create user.tf and, in it, create an access key ID and secret key; these will be needed for authentication.
resource "aws_iam_user" "gh_actions" {
name = "github_action_deploy_user"
path = "/"

}

resource "aws_iam_access_key" "gh_actions" {
user = aws_iam_user.gh_actions.name
}

resource "aws_secretsmanager_secret" "gh_actions" {
description = "GitHub actions Lambda Code user credentials"
name = "github_action_deploy_user_credentials"
recovery_window_in_days = 0
}

resource "aws_secretsmanager_secret_version" "gh_actions" {
secret_id = aws_secretsmanager_secret.gh_actions.id
secret_string = jsonencode({ "aws_access_key_id" = aws_iam_access_key.gh_actions.id, "aws_secret_access_key" = aws_iam_access_key.gh_actions.secret })
}
  • Grant the user permissions. The user will need permission to get and put objects in the S3 bucket, as well as permission to update the function code. Add the code below to iam.tf:
resource "aws_s3_bucket_acl" "s3_bucket" {
bucket = aws_s3_bucket.code_bucket.id
acl = "private"
}

data "aws_iam_policy_document" "code_bucket_gh_actions_policy_document" {
statement {
sid = "GHUserPermissionS3"
effect = "Allow"

actions = [
"s3:getObject",
"s3:putObject",
]

resources = [
"${data.aws_s3_bucket.code_bucket.arn}",
"${data.aws_s3_bucket.code_bucket.arn}/*"
]
}

statement {
sid = "GHUserUpdateFunctionRole"
effect = "Allow"

actions = [
"lambda:UpdateFunctionCode"
]

resources = [
"${aws_lambda_function.lambda_function.arn}" ]
}
}

resource "aws_iam_user_policy" "code_bucket_gh_actions_policy" {
name = "${aws_iam_user.gh_actions.name}-code-bucket-policy"
user = aws_iam_user.gh_actions.id
policy = data.aws_iam_policy_document.code_bucket_gh_actions_policy_document.json


}
  • Add environment variables on GitHub, under repo → Settings → Secrets and variables → Actions.

Create an environment. Then, on the environment, create AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The values can be found in the Secrets Manager secret named github_action_deploy_user_credentials; you can retrieve them with the CLI, as shown below.
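
If Secrets Manager is also routed through LocalStack, one way to read the stored credentials back out is with the CLI (drop the endpoint flag when running against real AWS):

aws --endpoint-url=http://localhost:4566 secretsmanager get-secret-value \
  --secret-id github_action_deploy_user_credentials \
  --query SecretString --output text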

  • Zip the code. The assumption here is that this is a new repo; for the purpose of illustration, I will create a new folder, lambda_test, outside the Terraform repo to hold the code.
zip -jr lambda_test/lambda_function.zip lambda_test/hello_world.py
  • Upload the zipped code to the S3 bucket.
aws s3 cp lambda_test/lambda_function.zip s3://test-bucket/localstack_test/
  • Update the function code. At the moment, AWS does not automatically pick up changes to the S3 objects that back Lambda functions, so one needs to update the function code explicitly, with the S3 key stated.
aws lambda update-function-code --function-name "arn:aws:lambda:us-east-1:<account_id>:function:HelloWorldLambda" --s3-bucket "test-bucket" --s3-key "localstack_test/lambda_function.zip"
  • Wait for the update to be completed. Updating a function takes a while, and any follow-up command, such as publishing a version (see the example after the command below), will fail unless the update has finished.
aws lambda wait function-updated --function-name "arn:aws:lambda:us-east-1:<account_id>:function:HelloWorldLambda"
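
For example, publishing a version (a hypothetical follow-up step, not part of this post's flow) would fail if run before the wait completes:

aws lambda publish-version --function-name "arn:aws:lambda:us-east-1:<account_id>:function:HelloWorldLambda"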

All of the above can be put into a GitHub Actions workflow that updates the function whenever the file changes. Create a file deploy_lambda.yml under .github/workflows:

name: Deploy Lambda Function Through S3

on:
  push:
    branches:
      - main
    paths:
      - "lambda_test/hello_world.py"

env:
  AWS_REGION: "us-east-1"
  S3_BUCKET_NAME: "test-bucket"

jobs:
  Deploy-Lambda-Functions:
    name: Deploy Function
    runs-on: ubuntu-latest
    environment: "dev"
    steps:
      - uses: actions/checkout@v3
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      - name: Zip file
        run: zip -jr lambda_test/lambda_function.zip lambda_test/hello_world.py
      - name: Copy File to S3
        run: |
          aws s3 cp lambda_test/lambda_function.zip s3://test-bucket/localstack_test/
      - name: Upload Function to Lambda
        run: |
          aws lambda update-function-code --function-name "arn:aws:lambda:us-east-1:<account_id>:function:HelloWorldLambda" --s3-bucket "test-bucket" --s3-key "localstack_test/lambda_function.zip"
      - name: Remove Zip File
        run: rm lambda_test/lambda_function.zip
      - name: Wait for update to finish
        run: aws lambda wait function-updated --function-name "arn:aws:lambda:us-east-1:<account_id>:function:HelloWorldLambda"

This will deploy the code on every push to the main branch.

Summary

At the end of this post, you should be able to:

  • Create a Lambda function with code stored on S3
  • Give a deploy user the right permissions so the function can be deployed from GitHub
  • Create AWS resources using Terraform and LocalStack
  • Deploy a Lambda function using GitHub Actions

The code can be found here. In the next post, we will look at how to work with a Lambda function that is located inside a VPC.
