Lambda Optimizations 101: Everything You Need to Know

Matan Cohen Abravanel
Published in Melio’s R&D blog · 5 min read · May 18, 2023

We recently began optimizing the front-end’s requests to API gateways backed by AWS Lambda. The investigation has been fascinating, as its results will determine whether we stick with Lambda or transition to a container-based alternative, such as Fargate or EC2.

I am going to share all the tips and tricks we’ve learned. Our tech stack includes SAM (CloudFormation), API Gateway, Lambda, Node.js (TypeScript), and Aurora Postgres.

Identifying Where Your Lambda Spends Its Time:

Before we start, it is essential to use the appropriate tools for Lambda tracing.

  1. Lumigo’s “Timeline” tab is an excellent tool when investigating a specific Lambda function. It provides insights into time spent on resources and requests.
(Lumigo -> Your lambda -> Timeline)
  2. Datadog’s “APM” tab enables you to view the latency of a specific function, which is incredibly useful. (Credit Nir Sivan)
(Datadog -> APM -> Your lambda)

Lambda Lifecycle Optimizations:

From our research, Lambda optimizations can be divided into three categories:

  • Cold Start Optimization: This is the time it takes for AWS to download and unpack your code’s zip file and initialize the runtime.
  • Warm Start Optimization: This refers to when you receive an already initialized Lambda function, either through AWS provisioned concurrency or a reused Lambda environment (this is where we will focus most).
  • Code Execution Optimization: This involves the duration of the handler’s run time.

Let’s Dive into Optimization!

Cold Start Optimization:

The primary way to reduce cold start time is by reducing your package size. This can be achieved by utilizing tools like ESBuild or Webpack, and by optimizing your imports to fetch specific resources.
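In a SAM template, esbuild bundling can be enabled through build metadata; minification and tree-shaking typically shrink the package considerably. A minimal sketch (the function name, runtime, and entry point are placeholders):

```yaml
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Handler: src/handler.handler
    Runtime: nodejs18.x
  Metadata:
    BuildMethod: esbuild   # sam build bundles and tree-shakes the code
    BuildProperties:
      Minify: true
      Target: es2020
      EntryPoints:
        - src/handler.ts
```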

However, in some cases, reducing package size might be challenging. In such instances, relying on provisioned concurrency might be beneficial.

Warm Start Optimization:

When you configure provisioned concurrency, you receive a “warm” Lambda function. Ensure the “warm” Lambda function is as prepared as possible so that no time is wasted per invocation. For instance, with database connections, take the following steps:

  1. Initialize Connection: Call DB.init() outside your Lambda handler. Using top-level await is preferred; our tests indicate that even a synchronous call to the function can significantly affect performance.
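As a sketch of the pattern, with a hypothetical initDb() standing in for the real database client: the connection is created once at module scope using top-level await (which requires an ESM build), so warm invocations skip the setup cost entirely.

```typescript
// Hypothetical DB client; substitute your real driver (e.g. a pg Pool).
async function initDb(): Promise<{ query: (sql: string) => Promise<string> }> {
  // Simulate expensive connection setup, performed once per container.
  return { query: async (sql: string) => `result of ${sql}` };
}

// Top-level await: the connection is established during the init phase,
// before the first invocation, and reused across warm invocations.
const db = await initDb();

export const handler = async (): Promise<string> => {
  // The handler only pays for query time, not connection time.
  return db.query("SELECT 1");
};
```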

2. Function Architecture: The default Lambda architecture is x86_64, but arm64 is faster (up to 25%!) and cheaper. Consider using it (see the AWS Lambda Architectures documentation) and update your Lambda in the SAM file accordingly.
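In SAM this is a one-line change (function name is a placeholder):

```yaml
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Architectures:
      - arm64   # Graviton2-based; better price/performance than x86_64
```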

3. Provisioned Concurrency: This determines how many warm Lambdas AWS keeps for you. However, be aware that this feature can be expensive. Configure it carefully, taking into account the expected concurrent API calls.

In the following code, if the Lambda is running in a production environment, we use a provisioned concurrency of 10; otherwise, 1.
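A sketch of how this can be expressed in SAM; the Env parameter, the IsProd condition, and the function name are assumptions:

```yaml
Parameters:
  Env:
    Type: String
    Default: dev

Conditions:
  IsProd: !Equals [!Ref Env, prod]

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      AutoPublishAlias: live   # provisioned concurrency requires a version/alias
      ProvisionedConcurrencyConfig:
        ProvisionedConcurrentExecutions: !If [IsProd, 10, 1]
```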

4. Provisioned Concurrency Scaling: You can request AWS to scale up your provisioned concurrency when a certain percentage is reached.

In the following code, we will instruct AWS to scale up to 100 provisioned concurrency once it reaches 60% of the current capacity, with a delay of 30 seconds between each added provision.
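This is configured with Application Auto Scaling resources in the same template. A sketch, assuming the alias "live" published via AutoPublishAlias and a function named MyFunction:

```yaml
ScalableTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  # May need DependsOn the published alias resource (MyFunctionAliaslive)
  Properties:
    ServiceNamespace: lambda
    ScalableDimension: lambda:function:ProvisionedConcurrency
    ResourceId: !Sub function:${MyFunction}:live
    MinCapacity: 10
    MaxCapacity: 100   # never scale beyond 100 provisioned executions

ScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: provisioned-concurrency-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ScalableTarget
    TargetTrackingScalingPolicyConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: LambdaProvisionedConcurrencyUtilization
      TargetValue: 0.6      # scale out once utilization passes 60%
      ScaleOutCooldown: 30  # seconds between scale-out steps
```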

5. AWS_NODEJS_CONNECTION_REUSE_ENABLED: Consider adding this environment variable to the Lambda with a value of “1”. This enables the process to reuse existing connections.

- Only relevant for AWS SDK v2.

(Credit Or Cohen)
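In SAM, the variable can be set like this (function name is a placeholder):

```yaml
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Environment:
      Variables:
        AWS_NODEJS_CONNECTION_REUSE_ENABLED: "1"  # keep-alive for SDK v2 connections
```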

6. Memory Allocation: I find the concept of memory allocation fascinating. The idea that increasing memory could significantly reduce latency is exciting in itself.

From a pricing perspective, even though allocating more memory is more expensive, there is a possibility that your Lambda function may run much faster, creating a ‘sweet spot’ where the execution speed is maximized while the cost remains the same.

You can explore and test for this sweet spot at the following link. Based on our research, 1769MB often emerges as the optimal memory allocation in most of our use cases. (Credit Or Cohen & Yariv Kohn)

- In some cases, AWS will actually increase vCPU when expanding memory: Lambda allocates CPU power in proportion to memory, and at 1,769 MB a function gets the equivalent of one full vCPU, which is why that value is often the sweet spot.

Code Execution Optimization:

Review your code and identify if there are queries that can be consolidated into one. If that isn’t feasible, utilize Promise.all to execute them concurrently, which is significantly faster.
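A minimal sketch of the pattern, with hypothetical getUser/getInvoices queries standing in for real Aurora calls:

```typescript
// Hypothetical independent queries; in our stack these would hit Aurora Postgres.
async function getUser(id: string): Promise<{ id: string; name: string }> {
  return { id, name: "user" };
}
async function getInvoices(userId: string): Promise<{ id: string }[]> {
  return [{ id: "inv-1" }];
}

// Sequential awaits: total latency = sum of both round trips.
// Promise.all: total latency ≈ the slowest single round trip.
export async function loadDashboard(userId: string) {
  const [user, invoices] = await Promise.all([
    getUser(userId),
    getInvoices(userId),
  ]);
  return { user, invoices };
}
```

Note that Promise.all rejects as soon as any query fails; if partial results are acceptable, Promise.allSettled is the alternative.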

Remember, optimizing your Lambda functions requires an understanding of where time and resources are being consumed. Using the right tools and strategies can drastically reduce latency and improve overall performance.

As a general note, these changes reduced our p95 latency from 7 seconds down to 350 milliseconds.

p95 before and after changes

Credits:

Stas Wishnevetsky, Or Cohen, Nir Sivan & Yariv Kohn
