The On-Demand Wakeup Pattern to Overcome the AWS Lambda Cold Start

Dinh-Cuong DUONG
Problem Solving Blog
4 min read · Aug 2, 2020

Lambda cold start is a major performance problem today if you are using a serverless architecture model. When we talk about the serverless architecture model, we are using the term in its general sense: a software architecture design in which your whole software stack is deployed as many functions as a service. In AWS, that typically means Lambda functions.

The AWS Lambda Startup Process.

In general, there are three main techniques to overcome this issue:

1. Function code and software stack optimization:

Use dynamically typed languages like Python or Node.js; avoid languages with heavy startup overhead that depend on big frameworks such as .NET or Java.

Break your big functions into pieces as small as you can, so that each one does not pull in heavy dependency packages (see the sketch after this section's conclusion).

Conclusion: This technique is required from the moment you consider developing software in the serverless architecture model. Choosing the right runtime, a lightweight framework, and minimal attached middleware is a critical job for a Software Architect.
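For Node.js, a concrete way to apply both points is to import only the SDK clients a function actually needs and to initialize them outside the handler. A minimal sketch, assuming a hypothetical lookup function backed by DynamoDB (the TABLE_NAME variable is an assumption):

```typescript
// handler.ts — a minimal sketch of dependency trimming for a Node.js Lambda.
// Importing only the DynamoDB client (instead of a monolithic SDK bundle)
// keeps the deployment package small and shortens the runtime bootstrap.
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";

// Created once per execution environment, outside the handler,
// so warm invocations reuse the client and its connections.
const db = new DynamoDBClient({});

export const handler = async (event: { id: string }) => {
  const result = await db.send(
    new GetItemCommand({
      TableName: process.env.TABLE_NAME, // assumed environment variable
      Key: { id: { S: event.id } },
    })
  );
  return result.Item ?? null;
};
```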

Even if you do well with this technique, the AWS Lambda cold start doesn't leave you alone. You have only finished the part you control, which is called code optimization. The other, harder part is runtime optimization: fetching your code, starting a container, and bootstrapping the runtime. Runtime optimization is AWS's responsibility, and you have no way to touch it. The way around this limitation is technique #2.

2. Keep your Lambda functions always warm:

Use a watchdog function to warm up your Lambda farm at a regular interval. Typically, a scheduled CloudWatch Events rule is the event source of a warm-up Lambda, and inside that Lambda you invoke a bunch of your other Lambdas as a kick-start. By doing so, your Lambdas stay warm for somewhere between 5 and 30 minutes. The exact idle lifetime of a function is not explicitly published and shifts slightly over time.
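A minimal sketch of such a watchdog, assuming the target function names arrive through a hypothetical WARM_FUNCTIONS environment variable and the CloudWatch Events rule fires on a schedule expression such as rate(1 minute):

```typescript
// warmup.ts — a watchdog sketch; the function names below are assumptions,
// wired in through the WARM_FUNCTIONS environment variable.
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

export const handler = async () => {
  // e.g. WARM_FUNCTIONS="products-fn,orders-fn,users-fn"
  const targets = (process.env.WARM_FUNCTIONS ?? "").split(",").filter(Boolean);

  await Promise.all(
    targets.map((name) =>
      lambda.send(
        new InvokeCommand({
          FunctionName: name,
          InvocationType: "Event", // async fire-and-forget kick-start
          Payload: Buffer.from(JSON.stringify({ warmup: true })),
        })
      )
    )
  );
  // Note: one async invoke warms one environment per function. To keep N
  // concurrent environments warm, fire N concurrent RequestResponse invokes.
  // Target handlers should short-circuit on { warmup: true } payloads.
};
```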

In practice, a watchdog firing every minute is used to be sure your Lambdas are always up, comfortably inside that idle window, whenever one of them is called. By a rough calculation, each warm environment is then pinged 1,440 times per day, or 43,200 times per month.

Suppose your system is designed to serve 100 concurrent users, and each user invokes 100 functions to run a complete user flow. Keeping 100 environments warm for each of those 100 functions brings the total to 43,200 x 100 x 100 = 432,000,000 invocations per month. It costs you a constant 169.53 USD.
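That figure can be reproduced from the public on-demand pricing of the time, under the assumption that each ping runs a 128 MB function billed at 100 ms, with the monthly free tier deducted. A back-of-the-envelope sketch:

```typescript
// cost.ts — reproducing the warm-up bill. The pricing constants are the
// published us-east-1 on-demand rates of the time; the 128 MB / 100 ms
// per-ping figures are assumptions.
const invocations = 43_200 * 100 * 100;          // 432,000,000 pings/month
const requestCost = (invocations / 1e6) * 0.2;   // $0.20 per 1M requests = $86.40
const gbSeconds = invocations * 0.1 * 0.125;     // 100 ms at 128 MB = 5,400,000 GB-s
const computeCost = gbSeconds * 0.0000166667;    // ≈ $90.00
const freeTier = 0.2 + 400_000 * 0.0000166667;   // 1M requests + 400,000 GB-s ≈ $6.87
console.log((requestCost + computeCost - freeTier).toFixed(2)); // "169.53"
```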

At AWS re:Invent 2019, a new feature called Provisioned Concurrency was released. It lets you enable and set a number of expected concurrent executions for a Lambda, and all requests within that capacity are handled by pre-initialized execution environments (Lambda runtimes kept ready on EC2 behind the scenes).

Given the example above, reserving 100 provisioned concurrency, with each pre-initialized environment serving one user's 100 requests in turn, costs you 135.00 USD per month.
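Enabling it is a one-time configuration against a published version or alias. A sketch using the AWS SDK, where the function name and alias are hypothetical placeholders:

```typescript
// provision.ts — enabling Provisioned Concurrency; "orders-fn" and the
// "live" alias are hypothetical placeholders.
import {
  LambdaClient,
  PutProvisionedConcurrencyConfigCommand,
} from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

await lambda.send(
  new PutProvisionedConcurrencyConfigCommand({
    FunctionName: "orders-fn",
    Qualifier: "live", // must be a published version or alias, not $LATEST
    ProvisionedConcurrentExecutions: 100,
  })
);

// Rough cost at the us-east-1 rate of the time, assuming 128 MB functions:
// 100 x 0.125 GB x 720 h x 3600 s x $0.0000041667/GB-s ≈ $135/month.
```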

Conclusion: To keep all your Lambdas warm at all times, you pay a constant cost per month, and the design must be based on a predicted number of concurrent users. That prediction is very hard to get right. You waste money during the periods of the day when there are no users at all, and during a high-workload peak many users will still be stressed by the Cold Start devil.

3. Wake up your Lambdas just before you need them:

The ideal solution to the Cold Start problem is for the function to be started slightly before it is really used. In human words, the situation is like a friend who is coming to pick you up saying: "I will arrive in 15 minutes, please be downstairs when I get there."

The on-demand lambda wakeup pattern.

You are a Lambda, and your friends are the clients that will call you when the time arrives. In a real client-server use case: after the user logs in, your client calls an API to wake up the bunch of Lambdas that serve the Product features; after the user picks a product and heads to checkout, your client calls another API to wake up the Order-processing Lambdas, and so on.
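What that client-side call might look like is sketched below; the /wakeup path and the operation names are illustrative assumptions, not a fixed API:

```typescript
// client.ts — hypothetical wake-up calls fired from the client; the
// endpoint path and operation names are placeholders.
async function wakeUp(operations: string[]): Promise<void> {
  await fetch("https://api.example.com/wakeup", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ operations }),
  });
}

// Right after login: warm the Lambdas behind the product features.
export async function onLogin() {
  await wakeUp(["listProducts", "getProduct"]);
}

// When the user heads to checkout: warm the order-processing Lambdas.
export async function onCheckoutIntent() {
  await wakeUp(["createOrder", "processPayment"]);
}
```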

In the simplify-graphql framework, this concept is built in by default as a "PING" REST API alongside the GraphQL endpoint. The payload of the PING API is a list of the Operations the client will use within the next couple of minutes, and every Lambda function linked to those operations is woken up.
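As a rough illustration of the idea (not simplify-graphql's actual implementation), a PING handler could map the requested operations to function names and fire asynchronous invocations; the mapping below is an assumption:

```typescript
// ping.ts — an illustration of the PING idea, not simplify-graphql's code.
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

// Hypothetical mapping from GraphQL operations to the Lambdas resolving them.
const OPERATION_FUNCTIONS: Record<string, string> = {
  listProducts: "products-fn",
  getProduct: "products-fn",
  createOrder: "orders-fn",
};

export const handler = async (event: { body: string }) => {
  const { operations = [] } = JSON.parse(event.body) as { operations?: string[] };
  const targets = [
    ...new Set(
      operations
        .map((op) => OPERATION_FUNCTIONS[op])
        .filter((name): name is string => Boolean(name))
    ),
  ];

  await Promise.all(
    targets.map((name) =>
      lambda.send(
        new InvokeCommand({
          FunctionName: name,
          InvocationType: "Event", // fire-and-forget: the response is not awaited
          Payload: Buffer.from(JSON.stringify({ warmup: true })),
        })
      )
    )
  );

  return { statusCode: 200, body: JSON.stringify({ warmed: targets }) };
};
```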

Conclusion: With this technique, you don't have to reserve a concurrency capacity whose ideal size you have no way to choose, or try to auto-scale that number over time. Your clients tell your system to prepare just before they need it. This is the on-demand resource request concept, and it is very easy to put in place.

Check out https://www.npmjs.com/package/simplify-graphql to see how it organizes a large-scale GraphQL project with a serverless architecture model.

Follow my articles

The wicked problem of Scalability in Cloud Computing and How to defeat?


Dinh-Cuong DUONG
(MSc) Cloud Security | Innovator | Creator | FinTech CTO | Senior Architect.