One of the sparsely documented topics that frustrated me while coming up to speed on AWS Lambda was how variables declared outside the handler function are handled across the lifecycle of lambda functions. In this article, I’ll explain this behavior with some examples that demonstrate the advantages of using global variables, along with a few pitfalls that you should be aware of along the way.
Note: if you’re totally new to AWS Lambda or serverless programming, I suggest you go through the AWS “Getting Started” section first, before returning to this article.
When a lambda is triggered, AWS does the following:
- it launches the execution context (i.e. a virtual container)
- it runs the lambda handler function
Of course, launching the execution context actually has many sub-parts, but for the purpose of this article, the key thing to remember is that this step involves significant latency. (For an in-depth look at how this latency varies, I refer you to this excellent article by Mikhail Shilkov).
The good news is that AWS keeps the execution context around between lambda invocations in order to reuse resources declared outside the handler function. As a developer you can take advantage of this behavior by making global declarations of reusable or cached resources outside the handler function. This is considered an AWS best practice.
Let’s take a closer look. Here’s a lambda that does a simple database query and returns the results.
Note that the pool variable is declared outside the handler function. I’ve taken the liberty of registering some console.log calls when:
- the pool is created
- a connection is created in the pool
- an existing connection is acquired from the pool
- a connection is released back into the pool for reuse
This handler function is declared as an asynchronous function, meaning that it will return a promise back to the AWS execution context at the conclusion of execution — in this case, with the results of the query.
To illustrate the connection pooling in action, I deployed this function onto my AWS account and then created a simple CloudWatch Rule to trigger it every minute. Letting it run for 3 minutes, here are my logs:
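For reference, a schedule like this can also be declared alongside the function itself. Here’s a hypothetical AWS SAM template fragment (the resource name, handler, and runtime are made up for illustration):

```yaml
Resources:
  PoolDemoFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs18.x
      Events:
        EveryMinute:
          Type: Schedule
          Properties:
            Schedule: rate(1 minute)
```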
As you can see from the highlighted log entries, the connection pool is initialized once. Do you notice the “undefined” verbiage immediately before the “INFO Initializing connection pool…” message? This is normally where AWS would automatically add the RequestId of the lambda request to the CloudWatch log message. However, because the AWS execution context is being initialized at this point, the RequestId is not yet generated, and we see a value of “undefined”. This is a hint that the log is taking place during execution context initialization.
Shortly afterward, there’s a “START…” log message, indicating the beginning of a handler function invocation. You can see that a connection is created, used, and then returned to the connection pool. But is the pool actually working? Yes! On each subsequent invocation, the pool is not re-initialized. Instead, we see the connection being acquired from the pool and then released back to the pool — exactly as intended. We know it’s the same connection, because the connection threadId is identical.
The connection pool object is not destroyed between handler invocations. It persists and remains available to be used for the next request (up to a point). This is referred to as keeping the execution context “warm”. It is a concrete example of an AWS best practice:
You should minimize instantiation of reusable resource-intensive constructs in the handler function — instead, do this outside the handler to reap the benefits of context reuse.
However, AWS does not keep the lambda execution context alive forever. Conceptually, for AWS’s “pay only for what you use” serverless business model to be viable, a balance must be struck between keeping resources “warm” and recovering resources when they are no longer actively being used.
But when are resources reclaimed by AWS? This is not explicitly stated in any official documentation that I could find, probably because Amazon’s own infrastructure changes and evolves over time. Current evidence suggests, though, that lambda execution contexts are reclaimed after about 10 minutes of inactivity. My personal observations support this as well. What happens once the execution context is destroyed? The next function invocation will re-initialize it, but as mentioned at the beginning of the article, a significant latency penalty is incurred.
To demonstrate what happens, let’s continue. First, I disabled the CloudWatch rule that was running the lambda every minute. Then, after allowing for a significant cooling-off period (~17 minutes in this case), I manually triggered my function again once. Here are the CloudWatch logs:
We see the “undefined INFO Initializing connection pool…” message, which shows that the execution context has re-initialized. This re-initialization is called a “cold start”. After the cold start, you can see the same pattern as above: a connection instantiation, acquisition, use, and release of the connection back to the pool, which is nice.
The mysql package is actually handling most of the heavy lifting of initializing the connections within the connection pool, and therefore I don’t have to take too many special precautions in my handler code. If you’re authoring your own caching mechanisms, you’ll want to perform null-checking in your handler code (or your cache) to ensure you don’t get unpleasant surprises due to cold starts like this.
Judging by the timestamps in the logs, you can see that the latency involved with the cold start isn’t too bad, but this is a small function, depending only on the mysql npm package. The more code dependencies you have, the longer the context initialization will take. It’s also important to note that the timestamps shown in my CloudWatch logs are not truly representative of the entire amount of latency associated with context initialization.
- DO declare resources outside your handler that you can potentially reuse. This will give you great performance benefits as long as you are careful!
- DO take care with your handler function/cache code to check for nulls and/or lazy-initialize resources that may have been cleared out by cold starts.
- DON’T assume anything you stored in a global variable (i.e. anything declared outside the handler function) will still be in that global variable after a cold-start.