Best Practices for AWS Lambda Container Reuse

Optimizing Warm Starts When Connecting AWS Lambda to Other Services

AWS Lambda scales well because it is serverless and stateless, allowing many copies of the Lambda function to be spawned instantaneously (as described here). When writing application code, however, you are likely to want access to some stateful data, which means connecting to a datastore such as an RDS instance or S3. Connecting to other services from AWS Lambda adds time to each invocation, and high scalability can have side effects of its own, such as reaching the maximum number of allowed connections to an RDS instance. One way to counter this is to use container reuse in AWS Lambda to persist the connection and reduce the function's running time.

There are some useful diagrams here to explain the lifecycle of a lambda request.

The following occur during a cold start, when your function is invoked for the first time or after a period of inactivity:

  • The code and dependencies are downloaded.
  • A new container (execution environment) is started.
  • The runtime is bootstrapped.

The final action is to start your code, which happens every time the lambda function is invoked. If the container is reused for a subsequent invocation of the lambda function, we can skip ahead to starting the code. This is called a warm start, and this is the step we can optimize when connecting to other services by defining the connection outside the scope of the handler method.

Connecting to Other AWS Services from Lambda

Example: Connect to RDS instance, AWS icons sourced from here

We have a basic and common example to run through: we want to connect to an external resource to fetch enrichment data. In this example, a JSON payload comes in with an ID, and the Lambda function connects to an RDS instance to find the name corresponding to that ID so we can return the enriched payload. Because the Lambda function is connecting to RDS, which lives in a VPC, the function now needs to live in a private subnet too. This adds a couple of steps to the cold start: a VPC elastic network interface (ENI) needs to be attached (as mentioned in Jeremy Daly’s blog, this adds time to your cold starts).

Note: we could avoid using a VPC if we were to use a key/value storage with DynamoDB instead of RDS.

I will go over two solutions to this task. The first is my ‘naive’ solution, whereas the second optimizes for warm start times by reusing the connection for subsequent invocations. Then we’ll compare the performance of each solution.

Option 1 — Connect to RDS Within the Handler

This code example shows how I might naïvely approach this task — the database connection is within the handler method. There is a simple select query to fetch the name of the ID before returning the payload, which now includes the name.
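A minimal sketch of this naive approach might look like the following. It uses Python's built-in sqlite3 module as a stand-in for the RDS connection so that it runs anywhere; `open_connection`, the `names` table, and its single row are hypothetical, and in a real function you would call your database driver's connect function with the RDS endpoint instead.

```python
import sqlite3


def open_connection():
    """Hypothetical stand-in for connecting to RDS.

    In a real function this would be your driver's connect call pointed at
    the RDS endpoint; sqlite3 is used here only so the sketch is runnable.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO names (id, name) VALUES (1, 'alice')")
    return conn


def handler(event, context):
    # Naive approach: a new connection is opened on every invocation...
    conn = open_connection()
    try:
        row = conn.execute(
            "SELECT name FROM names WHERE id = ?", (event["id"],)
        ).fetchone()
    finally:
        # ...and closed again before the handler returns.
        conn.close()
    # Return the incoming ID enriched with the name (None if the ID is unknown).
    return {"id": event["id"], "name": row[0] if row else None}
```

Every invocation, warm or cold, pays the full cost of opening and closing the connection.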

Let’s see how this option performs in a small test: a burst of 2000 invocations with a concurrency of 20. The minimum duration is 18 ms, the average is 51 ms, and the maximum (the cold start duration) is just over 1 second.

Lambda Duration

The graph below shows that there is a maximum of eight connections to the database.

No. of connections to RDS database in a 5 minute window.

Option 2 — Use a Global Connection

The second option is to define the connection as a global outside of the handler method. Then, inside the handler, we add a check to see if the connection exists, and only connect if it doesn’t. This means that the connection is made only once per container. Because the connection is created behind this conditional, code paths that never need the database never open a connection.

We are no longer closing the connection to the database, so the connection remains open for a subsequent invocation of the function. Reusing the connection significantly reduces warm start durations: the average duration is approximately three times shorter, and the minimum is 1 ms rather than 18 ms.

Lambda Durations

Connecting to an RDS instance is a time-consuming task, and not having to connect on every invocation benefits performance. With the global connection and the same simple query, the number of database connections peaks at 20, which matches the level of concurrency (we made 20 concurrent invocations x 100 times). When the burst of invocations stops, the connections gradually close.

Now that AWS has increased the Lambda duration allowance to 15 minutes, database connections could last longer, and you could be in danger of reaching the RDS maximum number of connections. The default can be overridden in the RDS parameter group settings, although increasing the maximum number of connections could result in issues with memory allocation. Smaller instances can have a default max_connections value of less than 100. Be mindful of these limits, and only add application logic to connect to the database when needed.
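As a rough sanity check, the arithmetic can be sketched as below. It assumes one global connection per container, so peak connections roughly equal peak concurrency (as in the test above, where both were 20). The function names and the `reserved` parameter (connections set aside for other clients) are hypothetical.

```python
def peak_connections(concurrency, connections_per_container=1):
    """Estimate peak database connections for a given Lambda concurrency."""
    return concurrency * connections_per_container


def fits_within_limit(concurrency, max_connections, reserved=0,
                      connections_per_container=1):
    """Return True if the estimated peak stays within the database limit.

    `reserved` is a hypothetical allowance for connections used by other
    clients of the same database.
    """
    available = max_connections - reserved
    return peak_connections(concurrency, connections_per_container) <= available
```

For example, `fits_within_limit(20, 100)` holds, while a concurrency of 200 against the same limit does not.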

Using a Global Connection for Other Tasks

Lambda Connecting to S3

A common task we might need to perform with Lambda is to access stateful data from S3. The code snippet below is an AWS-provided Python Lambda function blueprint, which you can navigate to by logging in to the AWS console and clicking here. You can see in the code that the S3 client is fully defined outside of the handler, when the container is initialized, whereas in the RDS example the global connection was set inside the handler. Both approaches set global variables, making them available for subsequent invocations.

s3-get-object lambda blueprint code snippet https://console.aws.amazon.com/lambda/home?region=us-east-1#/create/new?bp=s3-get-object-python

Decrypting Environment Variables

The Lambda console gives you the option of encrypting your environment variables for additional security. The following code snippet is an AWS-provided Java example of a helper for decrypting environment variables in a Lambda function. You can navigate to the code snippet by following this tutorial (specifically step 6). Because DECRYPTED_KEY is defined as a class-level (static) variable, the decryptKey() function is called only once per Lambda container. Therefore, we will see a significant improvement in warm start durations.

https://console.aws.amazon.com/lambda/home?region=us-east-1#/functions and https://docs.aws.amazon.com/lambda/latest/dg/tutorial-env_console.html

Using Global Variables in Other FaaS Solutions

This approach isn’t isolated to AWS Lambda. The method of using a global connection can be applied to other cloud providers’ serverless functions as well. The Google Cloud Functions tips and tricks page gives a good explanation of non-lazy global variables (always initialized outside of the handler method) versus lazy global variables (set only when needed).
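The difference between the two styles can be sketched in a few lines of Python. Here `expensive_setup` is a hypothetical stand-in for any slow initialization (creating a client, decrypting a key, opening a connection), and `init_calls` exists only to make the behavior visible.

```python
init_calls = {"eager": 0, "lazy": 0}  # illustration only


def expensive_setup(label):
    """Hypothetical stand-in for slow initialization work."""
    init_calls[label] += 1
    return {"label": label}


# Non-lazy global: always initialized when the container loads the module,
# whether or not any invocation ends up using it.
eager_resource = expensive_setup("eager")

# Lazy global: left empty until an invocation actually needs it.
lazy_resource = None


def handler(event, context):
    global lazy_resource
    if event.get("needs_lazy"):
        if lazy_resource is None:
            lazy_resource = expensive_setup("lazy")
        return lazy_resource["label"]
    return eager_resource["label"]
```

The non-lazy variable adds its cost to every cold start, while the lazy one shifts the cost to the first invocation that needs it. In both cases the work is done at most once per container.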

Other Best Practices

Here are some other best practices to keep in mind.

Testing

Using FaaS lends itself to a microservices architecture, and having small, discrete pieces of functionality goes hand in hand with effective unit testing. To aid your unit tests:

  • Remember to exclude test dependencies from the lambda package.
  • Separate logic away from the handler method, as you would with a main method of a program.
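The second point can be sketched as follows, reusing the enrichment example from earlier. The names `enrich` and `lookup_name_in_db` are hypothetical, and the database lookup is left as a placeholder.

```python
def enrich(payload, lookup_name):
    """Pure business logic: no Lambda or database specifics."""
    return {**payload, "name": lookup_name(payload["id"])}


def lookup_name_in_db(record_id):
    """Hypothetical database lookup; a real one would query the datastore."""
    raise NotImplementedError("replace with a real query")


def handler(event, context):
    # The handler only wires the real dependency into the logic.
    return enrich(event, lookup_name_in_db)


def test_enrich_adds_name():
    # The unit test swaps in a dictionary lookup, so no database is needed.
    fake_lookup = {1: "alice"}.get
    assert enrich({"id": 1}, fake_lookup) == {"id": 1, "name": "alice"}
```

Because `enrich` takes the lookup as a parameter, it can be tested without a Lambda context or a database connection.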

Dependencies and Package Size

Reducing the size of the deployment package means that downloading the code will be faster at initialization, which improves your cold start times. Remove unused libraries and dead code to reduce the deployment ZIP file size. The AWS SDK is provided with the Python and JavaScript runtimes, so there is no need to include it in your deployment package.

If Node.js is your preferred Lambda runtime, you could apply minification and uglification to reduce the size of your function code and minimize the size of your deployment package. Some, but not all, aspects of minification and uglification can be applied to other runtimes; e.g., you cannot remove whitespace from Python code, but you can remove comments and shorten variable names.

Setting the Memory

Experiment to find the optimal amount of memory for the Lambda function. You pay for memory allocation, so doubling the memory doubles the price per millisecond. However, compute capacity increases with allocated memory, so the running time could drop to less than half of what it was, in which case the invocation costs less overall. There are already some useful tools to select the optimal memory setting for you, such as this one.
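The trade-off can be worked through with a small calculation. The price per GB-second below is an assumed, illustrative rate (check current AWS pricing for your region), the durations in the example are hypothetical, and the per-request charge is ignored.

```python
# Assumed illustrative rate in USD per GB-second; not authoritative.
ASSUMED_PRICE_PER_GB_SECOND = 0.0000166667


def invocation_cost(memory_mb, duration_ms,
                    price_per_gb_second=ASSUMED_PRICE_PER_GB_SECOND):
    """Duration cost of one invocation: memory (GB) x time (s) x rate."""
    gigabytes = memory_mb / 1024
    seconds = duration_ms / 1000
    return gigabytes * seconds * price_per_gb_second


# Hypothetical measurements for the same function at two memory settings.
cost_at_512 = invocation_cost(memory_mb=512, duration_ms=400)
cost_at_1024 = invocation_cost(memory_mb=1024, duration_ms=150)
```

With these hypothetical numbers, the 1024 MB setting uses 0.15 GB-seconds against 0.2 GB-seconds at 512 MB, so it is both faster and cheaper. If doubling the memory only halved the duration exactly, the cost would stay the same.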

To Conclude…

One thing to consider is whether applying the connection reuse method is necessary. If your Lambda function is invoked only infrequently, such as once a day, then you will not benefit from optimizing for warm starts. There is often a trade-off between optimizing for performance and keeping your code readable; the term “uglification” speaks for itself! In addition, adding global variables to your code to reuse connections to other services can make your code more difficult to trace. Two questions come to mind:

  • Will a new team member understand your code?
  • Will you and your team be able to debug the code in the future?

But chances are you have chosen Lambda for its scale and want high performance and low costs, so find the balance that fits your team’s needs.


These opinions are those of the author. Unless noted otherwise in this post, Capital One is not affiliated with, nor is it endorsed by any of the companies mentioned. All trademarks and other intellectual property used or displayed are the ownership of their respective owners. This article is © 2019 Capital One.