MASTERING JAVA COLD START ON AWS LAMBDA — VOLUME 1

Serkan Özal
9 min read · Mar 29, 2019


Java has been one of the most popular programming languages for years. Many enterprise companies still rely heavily on Java and maintain legacy applications written in it. But when it comes to AWS Lambda (or serverless in general), Java has some challenges, and cold start is the most infamous one. It is among the first problems such companies face when they want to migrate to AWS Lambda, since rewriting all of their applications in another programming language like Node.js, Python, or Go may not be feasible for them.

This post kicks off the “Mastering Java Cold Start On AWS Lambda” series, a new set of blog posts focused on the cold start problem for Java-based AWS Lambda functions. In each post, we will work on different improvements to reduce cold start time and share our experiments.

Let me first describe cold start briefly before diving into cold start optimizations.

Cold Start?

“Cold start” in the serverless world means that the serverless application is started and initialized to handle a request. Here, the term “serverless application” covers both the application itself and the container in which the user code runs. As you might guess, this initialization adds extra latency to the execution of the request, since it must complete before the request can be handled.

Fortunately, this initialization doesn’t occur on every request, as almost all serverless platforms are smart enough to reuse containers as much as possible. However, depending on the platform, existing containers can be destroyed and new ones created at any time (though never in the middle of handling an invocation) for many internal (resource scheduling/sharing, fixes/patches on the host environment, etc.) or external (new application deployment, configuration change, etc.) reasons. On the AWS Lambda platform, the following situations trigger new container starts, which cause cold starts:

  • there is no container alive
  • there are containers alive, but none of them is available, as all are busy handling other requests
  • a new version of the application was deployed, so new containers must start with the newer version
  • the configuration (environment variables, security groups, memory limit, etc.) was changed, so new containers must start with the new configuration

How to live with “Cold Start”?

As explained above, you can run away from cold start, but you can’t hide.

However, you can reduce cold start overhead:

  • You can reduce cold start occurrences by sending periodic warmup requests. You will still see cold starts, but fewer of them. At Thundra, we have open-sourced our warmup plugin (https://github.com/thundra-io/thundra-lambda-warmup), which sends periodic warmup messages concurrently to try to keep multiple containers alive. All of our agents support warmup out of the box: they detect empty warmup messages and skip them without passing them to your actual handler (a minimal sketch of this skip logic follows this list). If you are interested in our cold start experience, have a look at our blog post: https://medium.com/thundra/dealing-with-cold-starts-in-aws-lambda-a5e3aa8f532
  • In case of a cold start, the environment can be optimized to start and initialize faster, minimizing cold start latency. Here, optimizing environment initialization means speeding up both of the following:

- container allocation/provisioning/start

- application startup
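
As an illustration, here is a minimal sketch of how a handler can short-circuit warmup invocations on its own. The class name and the empty-payload convention are assumptions for this example; Thundra's agents perform this detection for you automatically.

import java.util.Map;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

// Sketch only: "WarmupAwareHandler" is a hypothetical name, and we assume
// warmup messages arrive with an empty payload.
public class WarmupAwareHandler implements RequestHandler<Map<String, Object>, String> {

    @Override
    public String handleRequest(Map<String, Object> input, Context context) {
        // An empty payload is treated as a warmup ping: return immediately,
        // keeping the container alive without running any business logic.
        if (input == null || input.isEmpty()) {
            return "warmup";
        }
        return doRealWork(input);
    }

    private String doRealWork(Map<String, Object> input) {
        // ... actual request handling goes here ...
        return "ok";
    }
}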

Let’s talk a little about what affects cold start latency and how to reduce/optimize it.

Regular cold start (screenshot from the video)

As shown in the diagram above, taken from the “Become a Serverless Black Belt: Optimizing Your Serverless Applications” session at AWS re:Invent 2017, some parts of the environment initialization (downloading the application and starting a new container) can be optimized by AWS, while the other parts (bootstrapping the runtime and starting the application code) are up to us.

In fact, “bootstrapping the runtime” can be optimized by both AWS and us:

  • We can develop our Lambda functions with runtimes that have less bootstrap overhead, like Go, Python, or Node.js, instead of Java and .NET.
  • And AWS can optimize the runtime bootstrapping phase to start faster (for example, in the Java runtime, by tweaking JVM arguments).

However, with AWS Lambda Custom Runtime support (https://aws.amazon.com/tr/about-aws/whats-new/2018/11/aws-lambda-now-supports-custom-runtimes-and-layers/), it is now possible for us to optimize runtime bootstrapping ourselves. At Thundra, we already provide custom runtimes for both Java and Node.js, and with the Java custom runtime you can get faster Lambda application startup, as explained here: https://docs.thundra.io/docs/java-custom-runtime-and-layer-support
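
Since a custom runtime's bootstrap script controls the exact java command line, it can pass startup-friendly JVM flags. The following is a hypothetical, simplified excerpt; these flags are a common community suggestion for faster JVM startup, not necessarily what Thundra's custom runtime uses, and the main class name is a placeholder.

# Hypothetical, simplified bootstrap excerpt for a Java custom runtime.
# Capping tiered compilation at the C1 level keeps JIT work during startup
# cheap, trading some peak throughput for a faster start.
exec java -XX:+TieredCompilation -XX:TieredStopAtLevel=1 \
    -cp "$LAMBDA_TASK_ROOT" example.runtime.Bootstrap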

Another thing that increases cold start latency significantly is VPC (Virtual Private Cloud). A VPC is a private network that helps you strictly control inbound and outbound network traffic. If security is a big concern for you, or the upstream services you consume are already behind a VPC, you might need to deploy your Lambda functions into that VPC.

The problem with having a Lambda function in a VPC is that the VPC introduces extra latency to Lambda container initialization. As shown below, this latency is caused by creating an ENI (Elastic Network Interface) and assigning its IP address to the Lambda container. In many cases, the VPC overhead can be close to 10 seconds.

Additionally, another problem with VPCs for Lambda is that you may run out of available IP addresses in the VPC, as each Lambda container requires its own IP address.

Cold Start within a VPC (screenshot from the video)

Running JVM on AWS Lambda Like a Pro

Java is one of the runtimes that suffer the most from cold start overhead on AWS Lambda.

The reasons behind the startup overhead are:

  • Classes (which contain the code to be executed) need to be loaded (read, parsed, and verified) into memory and initialized.
  • Loaded code initially runs in interpreted mode, meaning bytecodes are executed one by one by the Java virtual machine on the JVM stack, which is slow relative to native machine code execution. Fortunately, thanks to the JIT (the JVM’s just-in-time compiler), after a profiling period, hot bytecodes (those that run many times) are compiled into optimized native code, which runs much faster. In this context, there is a potential opportunity on AWS Lambda for GraalVM’s SubstrateVM (hopefully we will cover this topic in upcoming articles), which compiles Java source files and bytecodes into native code at build time rather than at runtime, so Java applications can start faster. But GraalVM is not ready for prime time on AWS Lambda yet, as it has some limitations.

However, there are things we can do in our application to start Java-based Lambda functions faster.

In this blog post series, we use a simple AWS Lambda function named “book-get-service”, which looks up a book in AWS DynamoDB by the given id. “book-get-service” takes the book id from the request and returns the book entity in the response, as shown below.

Request

{
    "id": "1"
}

Response

{
    "book": {
        "id": "1",
        "name": "Harry Potter and the Philosopher's Stone",
        "author": "J. K. Rowling",
        "publicationDate": "Thu Jun 26 03:00:00 EEST 1997"
    }
}

For the benchmarks, the “book-get-service” function is implemented in Java and runs on the Java 8 runtime.
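
For reference, here is a minimal sketch of what such a handler might look like with AWS Java SDK 1. The class name, the "book" table name, and the assumption that all attributes are strings are illustrative, not code from the original benchmark.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

// Sketch only: "BookGetHandler" and the "book" table name are assumptions.
public class BookGetHandler implements RequestHandler<Map<String, String>, Map<String, Object>> {

    // Created at class-initialization time, so its cost is part of the
    // cold start window measured in this post.
    private final AmazonDynamoDB dynamoDb = AmazonDynamoDBClientBuilder.defaultClient();

    @Override
    public Map<String, Object> handleRequest(Map<String, String> request, Context context) {
        Map<String, AttributeValue> item = dynamoDb
                .getItem("book", Collections.singletonMap("id", new AttributeValue(request.get("id"))))
                .getItem();
        // Assumes every attribute of the item is a string (S) attribute.
        Map<String, Object> book = new HashMap<>();
        item.forEach((name, value) -> book.put(name, value.getS()));
        return Collections.singletonMap("book", book);
    }
}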

In our experiments, for cold start invocations, we measure the elapsed time between the load of the handler class and the completion of the first request by the handler (returning the response to the AWS Lambda runtime), so that we focus only on the user code’s initialization time (a sketch of one way to capture this window follows the list below). This means we are not counting:

  • network latency for the request from the caller to the function
  • container provisioning for the AWS Lambda function
  • the start of the AWS Lambda function’s Java process
  • network latency for the response from the function back to the caller
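
For illustration, this window could be captured as sketched below. This assumes a static field initializer runs at class-load time; the handler name is hypothetical, and this is not necessarily the exact instrumentation used for the benchmarks.

import java.util.Collections;
import java.util.Map;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

// Sketch: a static initializer timestamps class load; the first invocation
// logs the delta just before its response is returned.
public class TimedBookGetHandler implements RequestHandler<Map<String, String>, Map<String, Object>> {

    private static final long CLASS_LOAD_TIME_NANOS = System.nanoTime();
    private static volatile boolean firstInvocation = true;

    @Override
    public Map<String, Object> handleRequest(Map<String, String> request, Context context) {
        Map<String, Object> response = doGetBook(request);
        if (firstInvocation) {
            firstInvocation = false;
            long coldStartMillis = (System.nanoTime() - CLASS_LOAD_TIME_NANOS) / 1_000_000;
            context.getLogger().log("Cold start took " + coldStartMillis + " ms");
        }
        return response;
    }

    private Map<String, Object> doGetBook(Map<String, String> request) {
        // ... DynamoDB lookup as in the sketch above ...
        return Collections.emptyMap();
    }
}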

BENCHMARK 1 — HTTP vs HTTPS

In the first benchmark, we compare HTTP-based and HTTPS-based communication with the AWS DynamoDB endpoint for getting the book item. The benchmarks are repeated with different memory limits, which affect the CPU power allocated proportionally to your Lambda function (and, especially above 1.5 GB, sometimes even the number of physical cores).

Note: For the first benchmark, we use AWS Java SDK 1, version 1.11.330.
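
For the HTTP variant, the SDK 1 client can be pointed at the plain-HTTP endpoint as sketched below. This is standard AWS Java SDK 1 configuration, though not necessarily the exact benchmark code; the class name is illustrative.

import com.amazonaws.ClientConfiguration;
import com.amazonaws.Protocol;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;

// Sketch: build a DynamoDB client that talks plain HTTP instead of HTTPS,
// skipping the TLS handshake and most security-class loading at startup.
public class DynamoDbClients {

    public static AmazonDynamoDB httpClient() {
        ClientConfiguration clientConfiguration =
                new ClientConfiguration().withProtocol(Protocol.HTTP);
        return AmazonDynamoDBClientBuilder.standard()
                .withClientConfiguration(clientConfiguration)
                .build();
    }
}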

Here are the results:

As the results show, HTTP-based communication has less cold start overhead than HTTPS-based communication, because HTTPS-based communication requires:

  • a TLS handshake
  • loading (reading, parsing, and verifying) tons of security-related classes
  • initialization of security components (ciphers, etc.)

Another point worth mentioning: as you can see, HTTPS-based communication improves considerably as the memory limit increases, which, as mentioned before, increases the proportionally allocated CPU. The reason is that, due to the nature of encryption, the initialization and processing parts of HTTPS-based communication are CPU-intensive tasks, so their performance is strongly affected by the available CPU.

BENCHMARK 2 — AWS SDK 1 vs AWS SDK 2

For this benchmark, we compare the AWS Java SDK 1 and AWS Java SDK 2 libraries, which handle the communication between our application and the AWS service endpoint. Since in our scenario most of the work is done by the SDK, the cold start overhead is mostly determined by the SDK’s performance.

AWS Java SDK 2 has many improvements over AWS Java SDK 1, such as non-blocking IO support and a pluggable transport layer (https://aws.amazon.com/tr/blogs/developer/aws-sdk-for-java-2-x-released/). Besides that, AWS Java SDK 2 has also been redesigned and improved to initialize faster, which is especially important in the AWS Lambda environment (https://docs.aws.amazon.com/en_us/sdk-for-java/v2/developer-guide/client-configuration-starttime.html). You can have a look at the comments written on GitHub during the development of AWS Java SDK 2: https://github.com/aws/aws-sdk-java-v2/issues/6

Note: For this benchmark, we use AWS Java SDK 1, version 1.11.330, and AWS Java SDK 2, version 2.2.2.

In our benchmark, to configure AWS Java SDK 2 to use a transport layer based on the pure JDK, we do the following (the full lookup then looks like the sketch after this list):

  • Add the `url-connection-client` dependency for the pure-JDK-based transport layer, which means fewer classes will be loaded:

<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>url-connection-client</artifactId>
    <version>${aws.sdk2.version}</version>
</dependency>

  • Configure the JDK’s URLConnection-based HTTP client to be used:

DynamoDbClientBuilder builder =
        DynamoDbClient.builder()
                .httpClientBuilder(UrlConnectionHttpClient.builder());
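
Putting it together, a hedged sketch of the same book lookup with AWS Java SDK 2 follows; the class name, "book" table name, and key layout are assumptions carried over from the earlier sketch.

import java.util.Collections;
import java.util.Map;

import software.amazon.awssdk.http.urlconnection.UrlConnectionHttpClient;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.GetItemRequest;

// Sketch only: "BookGetSdk2" and the "book" table name are illustrative.
public class BookGetSdk2 {

    // UrlConnectionHttpClient avoids pulling in the default Apache HTTP client,
    // so fewer classes are loaded during cold start.
    private static final DynamoDbClient DYNAMO_DB = DynamoDbClient.builder()
            .httpClientBuilder(UrlConnectionHttpClient.builder())
            .build();

    public static Map<String, AttributeValue> getBook(String id) {
        GetItemRequest request = GetItemRequest.builder()
                .tableName("book")
                .key(Collections.singletonMap("id", AttributeValue.builder().s(id).build()))
                .build();
        return DYNAMO_DB.getItem(request).item();
    }
}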

As with the first benchmark, the measurements are repeated with different memory limits, for the same reason.

When we compared AWS Java SDK 1 and AWS Java SDK 2 over both HTTP-based and HTTPS-based communication, we got the following results:

As shown in the results, AWS Java SDK 2 has less cold start overhead than AWS Java SDK 1 because it loads fewer classes, which can be verified from the JVM performance counters. For example, with a 3 GB memory limit and HTTPS-based communication, the performance counters below show that AWS Java SDK 2 spends less time loading application classes:

AWS Java SDK 1 Performance counters:

sun.cls.appClassLoadCount = 2278 [Variability: Monotonic, Units: Events]
sun.cls.appClassLoadTime = 972223260 [Variability: Monotonic, Units: Ticks]
sun.os.hrt.frequency = 1000000000 [Variability: Constant, Units: Hertz]
sun.os.hrt.ticks = 2039607260 [Variability: Monotonic, Units: Ticks]

AWS Java SDK 2 Performance counters:

sun.cls.appClassLoadCount = 1724 [Variability: Monotonic, Units: Events]
sun.cls.appClassLoadTime = 660129052 [Variability: Monotonic, Units: Ticks]
sun.os.hrt.frequency = 1000000000 [Variability: Constant, Units: Hertz]
sun.os.hrt.ticks = 1576943977 [Variability: Monotonic, Units: Ticks]
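
To put these counters into perspective: since sun.os.hrt.frequency is 1,000,000,000 Hz, ticks convert directly to nanoseconds. AWS Java SDK 1 spends 972223260 / 10^9 ≈ 0.97 seconds loading 2278 application classes, while AWS Java SDK 2 spends 660129052 / 10^9 ≈ 0.66 seconds loading 1724 classes, that is, roughly 310 ms and 554 classes saved on class loading alone.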

Final words

In this post, we first measured the initialization performance of HTTP-based communication against HTTPS-based communication. We saw that, especially at low memory limits, HTTP-based communication brings a much bigger performance gain, since initializing the components for HTTPS-based communication is a heavily CPU-intensive task.

Then we benchmarked AWS Java SDK 1 against AWS Java SDK 2, which, as mentioned, brings many improvements. We observed that AWS Java SDK 2 has 500–750 ms less initialization overhead than AWS Java SDK 1, which is promising for a single optimization.

We are going to continue sharing our experiments and findings on optimizing the cold start performance of Java-based AWS Lambda (or serverless in general) functions in upcoming blog posts, so keep an eye on our blog :)

You can sign up for our web console and start experimenting. Don’t forget to explore our demo environment (no sign-up needed). We are very curious about your comments and feedback. Join our Slack channel, send a tweet, or contact us through our website!


Serkan Özal

AWS Serverless Hero | Founder & CTO @ Thundra | Serverless Researcher | JVM Hacker | Oracle OpenSource Contributor | AWS Certified | PhD Candidate