How I Built an AWS App Without Spending a Penny — The CLI + SAM

Abhishek Chaudhuri
13 min read · Sep 22, 2023


AWS logo with dollar sign crossed out

This is part 3 of a 6-part series. See the previous parts where we build the front end.

IAM is the most important service to understand when working with AWS, regardless of your job role. It governs not only the amount of access granted to users but also the amount of access given to individual resources. IAM is completely free, so we should take advantage of it as much as we can. When you create an AWS account, you start with a root user that has full access to everything. This gives you the most freedom, but it's also the most dangerous account to have compromised. I'm sure you've heard stories about people's accounts getting hacked and then waking up to a massive bill from all the resources the hackers were able to spin up. Therefore, it's best practice to create a user with the minimum permissions required to perform your tasks, use strong, unique passwords, and protect these accounts with MFA (especially the root user!). You should sign in with this user account for 99% of your everyday tasks and only use the root account when required. For simplicity, I created a user with AdministratorAccess. This gives the user access to everything except the tasks only the root user can perform, such as changing the account settings or granting access to the Billing and Cost Management console. Since I'm exploring as much of AWS as I can, I like having as many permissions as possible to experiment, but ideally, you should deny all access and grant permissions one at a time for every resource you need.
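For reference, here's a minimal sketch of that setup with the CLI. The user name admin is hypothetical, and you'd run these commands once (as the root user) before switching to the new account:

# Create a day-to-day admin user instead of using the root account
aws iam create-user --user-name admin

# Attach the AWS-managed AdministratorAccess policy
aws iam attach-user-policy \
    --user-name admin \
    --policy-arn arn:aws:iam::aws:policy/AdministratorAccess

# Then enable MFA for both this user and the root user
# (easiest through the console under Security credentials)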

As I kept building this app, I wanted to incorporate the CLI to make AWS management quicker and more automated. The instructions for installing the CLI can be found here. To use the CLI, you must authenticate with AWS, and there are several ways to do this. AWS recommends using IAM Identity Center (formerly AWS Single Sign-On) to set up SSO across multiple accounts in an organization. But I'm a one-man startup: I don't have an Active Directory or SAML provider set up, nor would I want to tie my AWS credentials to a third party like Google or Microsoft. (Not only would I need to remember which account I used to sign in, but the damage would be greater if that account got compromised. If you're curious, here's a list of supported identity providers and SaaS applications.) So, I created an access key ID and secret access key and saved them to my machine. These must be kept secret! As soon as they're compromised, someone will have already spun up 100 EC2 instances mining Bitcoin. With the AWS CLI installed, I can do almost anything I can do through the Management Console. The CLI docs provide useful information about each command, including every option and sample output. Make sure to browse the v2 docs, since Google results tend to default to the v1 docs instead.
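As a quick sketch, storing the keys locally and verifying that they work looks something like this (us-east-1 is just an assumed region):

# Store the access key pair in ~/.aws/credentials via interactive prompts
aws configure
# AWS Access Key ID [None]: AKIA................
# AWS Secret Access Key [None]: ........................................
# Default region name [None]: us-east-1
# Default output format [None]: json

# Sanity check: prints the account ID and user ARN behind these credentials
aws sts get-caller-identity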

One bug I ran into, which is exclusive to WSL, is that if I put my PC to sleep and then wake it up, the clock gets out of sync. This causes the AWS APIs to fail since the signature requires a valid timestamp within 15 minutes of the time reported on AWS servers. Usually, I can run sudo hwclock -s to fix the clock. But sometimes, I have to fully reboot WSL to fix the issue. As of this writing, this is still an open issue: https://github.com/microsoft/WSL/issues/10006.
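For the record, the recovery steps look like this (wsl --shutdown must be run from PowerShell on the Windows side):

# Inside WSL: resync the clock after waking the PC from sleep
sudo hwclock -s

# If the clock is still skewed, restart WSL entirely from PowerShell:
#   wsl --shutdown
# then reopen the WSL terminal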

AWS recommends you rotate your credentials every 90 days, but they don’t provide an automated way of doing so. So, I decided to automate this process myself.

Thanos saying “Fine, I’ll do it myself”
From https://memegenerator.net

Here is the architecture diagram for the service I dubbed IAM Old:

Architecture diagram for IAM Old

Side note: AWS provides a page with a downloadable set of icons and resources you can use to create architecture diagrams. In my case, I used draw.io to create all my architecture diagrams, but you’re free to use whichever tool you prefer.

Every day, this service will check the age of all access keys across all users. If any of them are 90 days old, it will send an email to remind the user to update their credentials. (This doesn’t rotate the credentials on their behalf since the access key secret is only visible upon creation of the key. We use a separate shell script to rotate the credentials.) To initiate a task on a regular schedule, we can use EventBridge. Previously, we would create an EventBridge rule to run a cron job, but now AWS provides a dedicated EventBridge Scheduler for that purpose. This triggers a Lambda function to check IAM for the age of all access keys across all users. If the age is 90 days, Lambda sends an email to the user via SNS. We use a dead letter queue (DLQ) to collect any errors from EventBridge or Lambda. We start the CloudFormation template with the following:

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
# Globals
Description: >-
  A stack that sends a reminder to users whenever their access key needs to be rotated

Since the architecture is entirely serverless, we can use SAM (the Serverless Application Model) to simplify the CloudFormation template. SAM provides shorthands for several CloudFormation resources, including API Gateway, AppSync, Lambda, DynamoDB, and Step Functions, with its own CloudFormation-like docs. It also has its own CLI with commands to easily build, test, and deploy stacks. You must add the Transform: AWS::Serverless-2016-10-31 line to create a SAM template. This allows SAM to convert the template into a regular CloudFormation template that can be deployed. (Tip: To preview the generated CloudFormation template without deploying it, you can run sam validate --debug. While the console can only translate the code into JSON, this command can translate it into YAML for easier reading.) You are free to add regular CloudFormation resources to a SAM template. SAM is an optional tool that can make provisioning resources a little easier. (And since it's tested on some of the exams, it was worth learning for myself.) The other section unique to SAM is Globals. If you're provisioning multiple SAM resources, you can use the Globals section to define common properties across all those resources, as sketched below. But since I'm only creating one Lambda function, I don't include this section in the template.
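For illustration, a Globals section could look like the sketch below if this template had multiple functions. The values mirror the single function defined later; this block isn't part of my actual template:

# Hypothetical Globals section: defaults applied to every AWS::Serverless::Function
Globals:
  Function:
    Runtime: python3.11
    Architectures:
      - arm64
    Timeout: 3 # seconds
    Tracing: Active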

In the Resources section, we first define the Lambda function:

Resources:
  LambdaFunction:
    # Creates AWS::Lambda::Permission/Function, AWS::IAM::Role
    Type: AWS::Serverless::Function
    Properties:
      # Zip files run faster than container images (setting PackageType causes false drift)
      CodeUri: src/
      Handler: app.handler
      Runtime: python3.11 # see https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html
      Architectures:
        - arm64
      Description: Check the age of access keys for IAM users and send emails to an SNS topic
      Events:
        ScheduleEvent:
          # Schedule = AWS::Events::Rule, ScheduleV2 = AWS::Scheduler::Schedule
          Type: ScheduleV2
          Properties:
            DeadLetterConfig:
              Arn: !GetAtt SchedulerDLQ.Arn # applies the necessary resource-based policy
            Description: Check the age of access keys every day
            # You can't use * in both day-of-month [3] and day-of-week [5]
            # https://docs.aws.amazon.com/scheduler/latest/UserGuide/schedule-types.html#cron-based
            ScheduleExpression: "cron(0 0 * * ? *)" # run every day at midnight UTC
            State: ENABLED
      Policies:
        - arn:aws:iam::aws:policy/IAMReadOnlyAccess
        # SAM policy templates:
        # https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-policy-templates.html
        - SNSPublishMessagePolicy:
            TopicName: !GetAtt SNSTopic.TopicName
      # 128 MB of memory allocated by default
      # Automatically update the runtime version
      # Timeout after 3 seconds
      Environment:
        Variables:
          TopicArn: !Ref SNSTopic
      Tracing: Active
      DeadLetterQueue:
        TargetArn: !GetAtt LambdaDLQ.Arn # automatically applies permissions to the execution role
        Type: SQS
  SchedulerDLQ:
    Type: AWS::SQS::Queue
    DeletionPolicy: Delete
    UpdateReplacePolicy: Delete
    Properties:
      MessageRetentionPeriod: 345600
      ReceiveMessageWaitTimeSeconds: 5
      SqsManagedSseEnabled: true
  LambdaDLQ:
    Type: AWS::SQS::Queue
    DeletionPolicy: Delete
    UpdateReplacePolicy: Delete
    Properties:
      MessageRetentionPeriod: 345600
      ReceiveMessageWaitTimeSeconds: 5
      SqsManagedSseEnabled: true

Notice that we didn't define a separate resource for the EventBridge Scheduler. One of the perks of SAM is that we can define the event trigger within the Lambda block. In the Events block, we create a schedule that runs every day at 0:00 UTC (around 8 pm my local time, which is convenient since I'm usually on my PC at that time), along with a DLQ to collect any errors. The DLQ utilizes long polling (a receive-message wait time > 0) and encryption at rest via SQS-managed SSE to satisfy AWS's best practices. Note that the cron expression uses AWS-specific syntax rather than the standard five-field format, since it accepts six fields, including a year (see the examples below). By defining the schedule in the Lambda block, SAM automatically creates a policy that allows the schedule to invoke the Lambda function and send messages to the DLQ.
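To make the six-field format concrete, here are a few expressions in that syntax (only the first appears in my template; the rest are illustrative):

# cron(minute hour day-of-month month day-of-week year)
# cron(0 0 * * ? *)         -> every day at 00:00 UTC (used above)
# cron(30 12 ? * MON-FRI *) -> weekdays at 12:30 UTC
# cron(0 8 1 * ? *)         -> 08:00 UTC on the 1st of every month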

For Lambda, we use Python and store the source code in the src directory. The Lambda handler (aka the main function) is the handler method in app.py. Like with CodeBuild, ARM is cheaper than x86. Under Policies, we give Lambda read access to IAM and publish access to SNS. In the Lambda function, we need to reference the ARN of the SNS topic created in the template. We can pass this to the function using an environment variable, like how we passed the bucket name in CodeBuild. When Tracing is enabled, we can use X-Ray to trace Lambda invocations for debugging purposes. Finally, we create another DLQ to catch any Lambda failures. Lambda automatically updates the execution role to make requests to X-Ray, CloudWatch Logs, and SQS.

Another perk of SAM is that besides making Lambda definitions cleaner, it also makes deploying Lambda changes easier. Normally in CloudFormation, you either have to define the code within the template (which is a big no-no in my opinion, since you lose the ability to lint and validate your code in a text editor) or store the code somewhere in S3. But if you want to update the code without changing the infrastructure, you have to upload the new code to S3, update the S3 key (or object path), and trigger another deployment. This can be cumbersome at scale. With SAM, on the other hand, you can run sam build to package the Lambda code, and then sam deploy automatically converts the SAM template into a CloudFormation template with the appropriate S3 path to the Lambda code. This means we don't need to update the SAM template whenever we want to update the Lambda function. I recommend adding the same lifecycle rule from the frontend template to delete the SAM artifacts after one day. SAM also gives us the ability to test API Gateway and Lambda functions locally without AWS, using Docker to spin up containers that simulate an AWS environment.
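As a rough sketch, the day-to-day loop with the SAM CLI looks like this (LambdaFunction is the logical ID from the template above, and sam local invoke requires Docker):

sam build             # package the template and the code under src/
sam validate --debug  # optional: print the transformed CloudFormation template
sam deploy --guided   # first deploy: prompts for stack name, region, parameters
sam deploy            # later deploys: reuses the saved samconfig.toml

# Test the function locally in a Docker container that simulates Lambda
sam local invoke LambdaFunction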

Optionally, you can configure the DeploymentPreference of the Lambda function to integrate CodeDeploy with every function update. CodeDeploy is a free service (outside of on-prem deployments) that allows you to gradually roll out Lambda functions. You can do a canary deployment to update a function for a small percentage of users, then deploy to 100% of users after some time. Or you can shift traffic linearly until it reaches 100%. This works by utilizing aliases and versions of the Lambda function. Versions are immutable snapshots of your function as your code updates, similar to tags in Git. Aliases point to specific versions and can be changed. By default, Lambda runs the $LATEST version, which always reflects your newest code. For canary and linear deployments, an alias routes traffic between the newer and older versions of a function according to the configured weights (or routing percentages). You can create hooks (aka more Lambda functions) before and after traffic is shifted to do integration testing, load testing, performance testing, etc., to ensure your new Lambda function will work as expected in prod. But since these are relatively simple functions with infrequent updates, I omitted this section in favor of all-at-once deployments. The Lambda function will be updated instantly for all users without creating new versions.
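Had I wanted gradual rollouts, the function definition would gain something like the sketch below. PreTrafficHookFunction is a hypothetical test function; the Type values are standard CodeDeploy presets:

# Hypothetical sketch: gradual Lambda rollouts via CodeDeploy
LambdaFunction:
  Type: AWS::Serverless::Function
  Properties:
    # ...same properties as before...
    AutoPublishAlias: live # publish a new version and point an alias at it
    DeploymentPreference:
      # Shift 10% of traffic to the new version, then the rest after 5 minutes
      Type: Canary10Percent5Minutes
      # Or shift gradually: Type: Linear10PercentEvery1Minute
      Hooks:
        # Hypothetical Lambda function that runs tests before traffic shifts
        PreTraffic: !Ref PreTrafficHookFunction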

The following is the code for the Lambda function:

import boto3
from datetime import datetime, timezone
import os

# It's more efficient to only initialize boto3 clients during cold starts
iam = boto3.client("iam")
sns = boto3.client("sns")


def handler(event, context):
    users = iam.list_users()["Users"]
    reminders = check_all_access_keys(users)
    send_reminders(reminders)


def check_all_access_keys(users):
    # AWS timestamps are in ISO 8601 format with time zones
    today = datetime.now(timezone.utc)
    MAX_DAYS = 90
    reminders = []

    for user in users:
        username = user["UserName"]

        for access_key in iam.list_access_keys(UserName=username)["AccessKeyMetadata"]:
            if access_key["Status"] != "Active":
                continue

            access_key_id = access_key["AccessKeyId"]
            creation_date = access_key["CreateDate"]
            # boto3 automatically converts date strings into datetime format
            delta = (today - creation_date).days

            if delta == MAX_DAYS:
                reminders.append(
                    {
                        "username": username,
                        "access_key_id": access_key_id,
                        "creation_date": creation_date,
                        "delta": delta,
                    }
                )

    return reminders


def send_reminders(reminders, sns=sns):
    TOPIC_ARN = os.environ.get("TopicArn", "")

    for reminder in reminders:
        sns.publish(
            TopicArn=TOPIC_ARN,
            Subject=f"Hey {reminder['username']}, your AWS access key is {reminder['delta']} days old.",
            Message=f"It's that time of year again! The access key ID in question is {reminder['access_key_id']} and was created on {reminder['creation_date']:%A %B %d, %Y}.",
        )

In Python, boto3 is the name of the AWS SDK. (Fun fact: A boto is an Amazon River dolphin. The connection suddenly makes sense!) We can either initialize boto3 as a client or as a resource. Clients map one-to-one with the AWS API, while resources provide a high-level abstraction of the API. I prefer clients since it’s easier to cross-reference the methods with the CLI docs. Plus, AWS announced that it would no longer support resources for future updates, as stated here.
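Here's a small sketch of the difference; both snippets list IAM users, and the resource attributes are just snake_case versions of the API fields:

import boto3

# Client: maps one-to-one with the IAM API (what this project uses)
iam_client = boto3.client("iam")
for user in iam_client.list_users()["Users"]:  # plain dicts, like the CLI output
    print(user["UserName"])

# Resource: higher-level, object-oriented wrapper (no longer receiving new features)
iam_resource = boto3.resource("iam")
for user in iam_resource.users.all():  # lazily paginated User objects
    print(user.user_name)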

One thing worth noting is that we initialize the boto3 clients outside the handler function. This is an optimization tactic to make the Lambda function quicker. Lambda is designed for short executions, and by default, we only have a time limit of 3 seconds. Whenever you call a Lambda function, you're telling AWS to provision an environment (kind of like a container) to run your code. If the Lambda function hasn't been executed in a while, Lambda spends some time starting up this environment. This is known as a cold start. Once the environment is set up, the handler code is run. If the Lambda function is called more frequently, Lambda reuses the existing environment to run the code. This is known as a warm start. The diagram below summarizes the timeline for every Lambda invocation. In Python, the initialization code is whatever runs at global scope. During warm starts, the global code has already run and only the handler function is called. This way, we don't need to initialize the boto3 clients every time the Lambda function is invoked, since creating a client is a (relatively) expensive operation.

Lambda cold start to warm start timeline
From https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-1/

If you need to install additional dependencies beyond boto3, you can add a requirements.txt file in the same CodeUri path. sam build will detect it and bundle all the dependency files alongside the Lambda code. Alternatively, you can define a layer in the SAM template to keep the dependencies separate from the main code, as sketched below. Bundling all the dependencies locally makes deployments slower, but cold starts faster, since there's no extra layer to initialize. Remember to keep S3's 435 MB free tier limit in mind when you build dependencies. What I like to do is place the requirements file outside the code directory. This way, for tests, I can create a virtual environment and install libraries like boto3, while SAM ignores the file during builds. (By the way, I highly recommend reading this AWS blog to learn how to test Lambda functions, including mocking the services.)
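If you do go the layer route, a minimal sketch looks like this. DependencyLayer and the dependencies/ folder are hypothetical names; the BuildMethod metadata tells sam build to pip-install that folder's requirements.txt:

# Hypothetical dependency layer, referenced from the function via Layers
DependencyLayer:
  Type: AWS::Serverless::LayerVersion
  Properties:
    ContentUri: dependencies/ # folder containing its own requirements.txt
    CompatibleRuntimes:
      - python3.11
  Metadata:
    BuildMethod: python3.11 # let sam build install the requirements

# Then inside the function's Properties:
#   Layers:
#     - !Ref DependencyLayer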

Below the Lambda function in the SAM template, we define the SNS topic Lambda will publish:

Parameters:
  Email:
    Type: String
    Description: The email address that will receive SNS notifications for the access keys
    # Simple regex from https://stackoverflow.com/a/201378
    AllowedPattern: "(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])"

Resources:
  SNSTopic:
    Type: AWS::SNS::Topic
    Properties:
      # Encrypt using the default SNS SSE key
      # Key aliases: aws kms list-aliases
      KmsMasterKeyId: alias/aws/sns
      Subscription:
        - Endpoint: !Ref Email
          Protocol: email
  SNSTopicPolicy:
    Type: AWS::SNS::TopicPolicy
    Properties:
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: AllowSSLRequestsOnly
            Effect: Deny
            Principal: "*"
            Action: sns:Publish
            Resource: !Ref SNSTopic
            Condition:
              Bool:
                "aws:SecureTransport": false
      Topics:
        - !Ref SNSTopic

Like with S3 and SQS, we enforce encryption at rest and in transit. At rest, the topic encrypts messages using an AWS KMS key. In transit, the topic policy denies all insecure requests by checking whether "aws:SecureTransport" is false. For the subscription, we need to supply a valid email address. We utilize a parameter to let the user enter their email when first deploying the stack. In all subsequent deployments, you can reuse the same parameter value. We do this by adding the following to the generated samconfig.toml file under [default.deploy.parameters]:

parameter_overrides = "ParameterKey=Email,UsePreviousValue=true"
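So the first deploy supplies the email once, and every deploy afterward reuses it (the address below is a placeholder):

# First deploy: pass the email parameter explicitly
sam deploy --parameter-overrides Email=me@example.com

# Subsequent deploys: samconfig.toml tells SAM to reuse the previous value
sam deploy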

Once the stack is deployed, you will receive an email to confirm your subscription to the SNS topic.

When I receive an email that my credentials are too old, I run the following script to rotate them on my machine:

#!/bin/bash
# Rotate the AWS access keys every 90 days, based on:
# https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_RotateAccessKey
USERNAME=YOUR_USERNAME

# Get the current access key ID (-r removes "" from strings)
OLD_ACCESS_KEY_ID=$(aws iam list-access-keys --user-name $USERNAME | jq -r ".AccessKeyMetadata[0].AccessKeyId")

# Generate new credentials and save them to ~/.aws/credentials
echo "Generating new credentials..."
read NEW_ACCESS_KEY_ID NEW_SECRET_ACCESS_KEY < <(echo $(aws iam create-access-key --user-name $USERNAME | jq -r ".AccessKey.AccessKeyId, .AccessKey.SecretAccessKey"))
aws configure set aws_access_key_id $NEW_ACCESS_KEY_ID
aws configure set aws_secret_access_key $NEW_SECRET_ACCESS_KEY
echo "Saving new credentials to ~/.aws/credentials..."
sleep 10 # wait for the file to be saved to start using new credentials

# Delete the old credentials
echo "Deleting old credentials..."
aws iam update-access-key --user-name $USERNAME --access-key-id $OLD_ACCESS_KEY_ID --status Inactive
aws iam delete-access-key --user-name $USERNAME --access-key-id $OLD_ACCESS_KEY_ID
echo "Your access keys have been rotated successfully!"

At the top, make sure to supply the name of the user whose credentials you want to rotate. First, we save a reference to the old access key ID so we can delete the key later. Since the AWS CLI can return everything in JSON, we can use jq to parse the responses. Next, we create a new access key and save it with aws configure. Remember that this is the only time you get to view the secret access key. Then we can deactivate the old access key and delete it. In between creating and deleting the keys, we wait 10 seconds. From my testing and browsing similar scripts, this is needed to ensure the new credentials are fully saved on disk before the old ones are removed.

In the next part, we will use SAM to build the back end for our app.

The full GitHub repo can be found here: https://github.com/Abhiek187/aws-shop
