Reduce AWS Lambda Cost by Monitoring Memory Utilization

Centralizing Lambda memory metrics from Amazon CloudWatch Logs Insights.

May 3, 2024

Introduction

AWS Lambda is my preferred compute choice on Amazon Web Services (AWS). Run code without thinking about servers or clusters? Count me in. It’s especially useful for event-driven architectures or workloads with inconsistent traffic patterns.

Lambda is also dirt cheap. If you don’t run Lambda at scale, you can close this tab and go about your day. However, if you’re a large organization that runs thousands of Lambda functions, there are opportunities to lower your Lambda bill by optimizing memory allocation.

Lambda compute is billed per gigabyte-second, on top of a small per-request charge. When you create a Lambda function, you allocate between 128 megabytes (MB) and 10,240 MB of memory. This value is multiplied by the billed duration of each invocation to calculate the compute cost. After your function runs, you can see the memory consumed:

I allocated 10240 MB to the function above, but it only used 35 MB. The memory is vastly over-provisioned, and here lies our savings opportunity. However, it is important to note that lowering the memory allocation doesn’t always make sense. CPU and network are tied to memory, so you may need to provision more memory than necessary if your code is CPU or network-bound. You can read more on that here.
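To make the billing model concrete, here is a minimal sketch of the compute charge for a single invocation. The per-GB-second prices are assumptions based on the published us-east-1 on-demand rates at the time of writing, and invocation_cost is a hypothetical helper name, not something from my repository. The per-request fee and any free tier are ignored.

```python
# Assumed us-east-1 on-demand prices per GB-second; verify against the
# current AWS Lambda pricing page before relying on them.
PRICE_PER_GB_SECOND = {"x86_64": 0.0000166667, "arm64": 0.0000133334}


def invocation_cost(memory_mb, billed_duration_ms, architecture="x86_64"):
    """Compute charge for one invocation (per-request fee excluded)."""
    gigabytes = memory_mb / 1024          # allocated memory, not memory used
    seconds = billed_duration_ms / 1000   # billed duration is reported in ms
    return gigabytes * seconds * PRICE_PER_GB_SECOND[architecture]


if __name__ == "__main__":
    # The same one-second invocation at the maximum and minimum allocations.
    print(f"10240 MB: ${invocation_cost(10240, 1000):.10f}")
    print(f"  128 MB: ${invocation_cost(128, 1000):.10f}")
```

Because the charge scales linearly with allocated memory, the same invocation costs 80 times more at 10,240 MB than at 128 MB, regardless of how much memory the code actually touches.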

In this blog, I’ll demonstrate a method to view Lambda memory metrics in a centralized manner. The code for my solution is available here, and is built with the AWS Cloud Development Kit and Python Lambda functions.

Architecture

I’ll start with an architecture diagram of the system I built to collect and display metrics:

1. An AWS Step Functions state machine encapsulates the logic and invokes the loader Lambda function.

2. The loader function lists all Lambda functions and their associated Amazon CloudWatch Logs log groups in the AWS account, either in the AWS Regions of your choosing or in all enabled Regions. Each function’s architecture (x86_64 or arm64) is also included in the payload so that cost savings can be calculated.

3. Lambda functions and log groups are loaded into an Amazon Simple Queue Service (SQS) queue.

4. A second Lambda function, the worker, reads from the queue and submits a query to CloudWatch Logs Insights for each log group to collect memory utilization metrics. It then calculates the potential cost savings.

5. The query results and cost savings are sent to an Amazon Data Firehose stream, which in turn writes the data to Amazon S3 in Apache Parquet format.

6. An AWS Glue table is built on top of the data in S3 and Amazon Athena is used to query and analyze the data.
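
Steps 2 and 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from my repository: to_message, load_region, the message field names, and queue_url are hypothetical. It assumes the log group is either the one set in the function's LoggingConfig or the default /aws/lambda/<function name>.

```python
import json


def to_message(function_config, region):
    """Build one queue message from an entry returned by Lambda ListFunctions."""
    name = function_config["FunctionName"]
    # A function may override its log group; otherwise use the default name.
    log_group = (
        function_config.get("LoggingConfig", {}).get("LogGroup")
        or f"/aws/lambda/{name}"
    )
    return {
        "function_name": name,
        "log_group": log_group,
        "region": region,
        # Architecture is carried along because pricing differs per architecture.
        "architecture": function_config.get("Architectures", ["x86_64"])[0],
        "memory_mb": function_config["MemorySize"],
    }


def load_region(region, queue_url):
    """List every function in one Region and enqueue a message for each."""
    import boto3  # imported lazily so to_message works without AWS access

    lambda_client = boto3.client("lambda", region_name=region)
    sqs = boto3.client("sqs")
    count = 0
    for page in lambda_client.get_paginator("list_functions").paginate():
        for function_config in page["Functions"]:
            body = json.dumps(to_message(function_config, region))
            sqs.send_message(QueueUrl=queue_url, MessageBody=body)
            count += 1
    return count
```

Putting one message per function on the queue lets the worker scale out independently of the loader, which matters because each Logs Insights query can take a while to finish.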

Here is a pictorial representation of the state machine, which lists Lambda functions and CloudWatch log groups and loads them into an SQS queue:

Collecting Metrics

The state machine is invoked in the following manner:

{
  "regions": "us-east-1,us-east-2,us-west-1,us-west-2",
  "days": 730
}

regions is a comma-separated list of the Regions where your Lambda functions reside. If you omit this parameter, all enabled Regions are queried. days represents how many days of history to query when collecting memory metrics (730 in the example above). It defaults to 30 if the parameter is not included.
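
As a rough sketch of how that input could be resolved into concrete values, consider the helper below. parse_input and enabled_regions are hypothetical names for illustration; the caller is assumed to supply the list of enabled Regions, and the actual code in my repository may differ.

```python
DEFAULT_DAYS = 30  # used when "days" is missing from the input


def parse_input(event, enabled_regions):
    """Resolve the state machine input into a Region list and a day count."""
    raw_regions = event.get("regions", "")
    # Split the comma-separated string and drop empty or padded entries.
    regions = [r.strip() for r in raw_regions.split(",") if r.strip()]
    return {
        # Fall back to every enabled Region when none are specified.
        "regions": regions or list(enabled_regions),
        "days": int(event.get("days", DEFAULT_DAYS)),
    }


if __name__ == "__main__":
    print(parse_input({"regions": "us-east-1,us-east-2", "days": 730}, []))
    print(parse_input({}, ["us-east-1", "eu-west-1"]))
```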

For each log group, the following query is submitted to CloudWatch Logs Insights via the worker function:

    filter @type = "REPORT"
    | stats max(@memorySize / 1000 / 1000) as provisioned_memory_mb,
    min(@maxMemoryUsed / 1000 / 1000) as min_memory_used_mb,
    avg(@maxMemoryUsed / 1000 / 1000) as avg_memory_used_mb,
    max(@maxMemoryUsed / 1000 / 1000) as max_memory_used_mb,
    provisioned_memory_mb - max_memory_used_mb as over_provisioned_memory_mb,
    avg(@billedDuration) as avg_billed_duration_ms,
    count(@requestId) as invocations
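
Submitting that query from Python might look like the sketch below, which uses the boto3 CloudWatch Logs client to start the query and poll for results. run_query, rows_to_dicts, and poll_seconds are hypothetical names, and the polling loop is deliberately simple. Treat it as an outline rather than the worker's actual code.

```python
import time


def rows_to_dicts(results):
    """Flatten Logs Insights rows ([{field, value}, ...]) into plain dicts."""
    return [{cell["field"]: cell["value"] for cell in row} for row in results]


def run_query(log_group, query, days, region, poll_seconds=2):
    """Run a Logs Insights query over the last `days` days and return its rows."""
    import boto3  # imported lazily so rows_to_dicts works without AWS access

    logs = boto3.client("logs", region_name=region)
    end_time = int(time.time())           # the API expects epoch seconds
    start_time = end_time - days * 86400
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=start_time,
        endTime=end_time,
        queryString=query,
    )["queryId"]

    # Queries run asynchronously, so poll until a terminal status is reported.
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"Query on {log_group} ended as {response['status']}")
    return rows_to_dicts(response["results"])
```

Note that Logs Insights returns every value as a string, so fields such as provisioned_memory_mb need to be converted to numbers before any arithmetic.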

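The worker function calculates the potential savings using the following formula, with memory expressed in GB, duration in seconds, and cost as the per-GB-second price for the function’s architecture: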

over-provisioned memory * average billed duration * cost * invocations

Because a Lambda function cannot be allocated less than 128 MB, the over-provisioned memory used in the calculation is capped at the provisioned memory minus 128 MB, so it may be less than the value returned by the CloudWatch Logs Insights query.
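
Here is a minimal sketch of that calculation, including the 128 MB floor. The prices are assumed us-east-1 on-demand rates, the conversion of 1,024 MB per GB is an assumption, and potential_savings is a hypothetical name, so the numbers it produces may differ slightly from those in my repository.

```python
# Assumed us-east-1 on-demand prices per GB-second; verify before use.
PRICE_PER_GB_SECOND = {"x86_64": 0.0000166667, "arm64": 0.0000133334}
MIN_MEMORY_MB = 128  # smallest allocation Lambda allows


def potential_savings(provisioned_mb, max_used_mb, avg_billed_ms,
                      invocations, architecture="x86_64"):
    """Estimate savings from shrinking memory down to the observed peak."""
    over_provisioned_mb = provisioned_mb - max_used_mb
    # Memory cannot drop below 128 MB, so cap what can be reclaimed.
    reclaimable_mb = min(over_provisioned_mb, provisioned_mb - MIN_MEMORY_MB)
    reclaimable_mb = max(reclaimable_mb, 0)
    gigabytes = reclaimable_mb / 1024
    seconds = avg_billed_ms / 1000
    return gigabytes * seconds * PRICE_PER_GB_SECOND[architecture] * invocations


if __name__ == "__main__":
    # A function already at 128 MB has nothing to reclaim.
    print(potential_savings(128, 60, 1000, 100))
    # A 10,240 MB function that peaks at 35 MB and runs for one minute.
    print(f"{potential_savings(10240, 35, 60_000, 1):.6f}")
```

This estimate assumes duration stays the same after memory is reduced. Since CPU scales with memory, a CPU-bound function may run longer at a lower allocation and erase part of the savings.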

Viewing Data

After the state machine has finished and the SQS queue is clear of records, the following query needs to be run in Athena to load the partitions in the Glue table:

MSCK REPAIR TABLE lambda_memory_utilization;

You can view the metrics by running the following Athena query:

SELECT * FROM lambda_memory_utilization;

The query results show four Lambda functions in my account:

lambda-log-group-loader and lambda-log-group-worker both have 128 MB allocated, so there is no room for optimization.
LotsOfMemory has 10240 MB allocated but it only ran for 2 seconds, so there is little opportunity.
LotsOfMemoryLongRuntime has 10240 MB allocated and it ran for 14.5 minutes. There is a cost-savings opportunity of $0.1466 per invocation.

The savings aren’t great for my small operation, but if you’re a large organization running Lambda at scale, there could be an opportunity to optimize your Lambda bill.

Conclusion

I hope this blog was informative and helps you save money on your AWS Lambda bill. Drop me a note if you have questions or comments.