Optimizing your Java Lambda Cold Starts and Initializations

Published in My Local Farmer Engineering · 9 min read · Jul 27, 2021

Java tips on AWS to improve Lambda cold start
See our Twitch session on this topic!

Pole vaulting your Java Lambdas to success

Our team has recently undergone the process of creating an entirely new Delivery Service within the AWS Cloud. In this blog post, we want to cover some of the lessons that we’ve learned while writing our new service in Java on Lambda, as well as share some best practices that we believe would benefit others just starting out on their cloud journey. For more details on the design of our service, please see our other post on building A Serverless Java Solution for Deliveries.

Disclaimer
I Love My Local Farmer is a fictional company inspired by customer interactions with AWS Solutions Architects. Any stories told in this blog are not related to a specific customer. Similarities with any real companies, people, or situations are purely coincidental. Stories in this blog represent the views of the authors and are not endorsed by AWS.

Cold Starts And Making Them Warmer

One of the most oft-repeated warnings that you will see when searching for tips about creating Lambdas in Java is that they come with the intrinsic cost of slow cold start times. This is due to the way Java runs: the JVM has to start and load all of your classes before your code can execute. While this cold start time is definitely more noticeable in Java than in other languages, there are a number of things that you can do to cut down on the time it takes for your Lambda to initialize.

Avoid pulling in more dependencies than needed

Often when programming in Java, we have a tendency to add dependencies to our projects with wild abandon. And in most cases this doesn’t come with any extra costs outside of the initial server startup, as loading those extra classes causes no increase in execution time for long-running applications. Consequently, you would never notice the impact that extra dependencies have on your service. In Lambdas, however, we will hit cold starts far more frequently than we would in a typical server-deployed application. This means that when adding dependencies we need to make sure we aren’t loading extra dependencies that won’t actually be used.

For instance, many of us are probably familiar with Spring and Spring Boot. These injection frameworks can make life much easier when developing as we are able to avoid a lot of the boilerplate bloat that comes with writing services. However, they are also extremely intensive from a memory standpoint and require a huge number of classes to be loaded by your application.

A good practice is to avoid using frameworks like this when working with Lambda, as they are often unnecessary and will greatly increase your cold start times. Additionally, with services like API Gateway available, many of the features of Spring can be offloaded to other services rather than built directly into the application.

Use compile-time injection as opposed to runtime injection

If you want or need to perform dependency injection because your service is large and you want it to be easily maintained, then some good alternatives to consider instead of Spring are Dagger and Micronaut. The key benefit to these frameworks is that they perform compile-time dependency injection, as opposed to runtime dependency injection.

This means that only the dependencies you have specified and compiled along with your project will be loaded when your Lambda MicroVM starts, and none of the classes that Spring must load in order to perform runtime injection will be needed. While not as well-known as Spring, these frameworks provide much of the functionality of Spring Beans with less impact on your cold starts. Guice is another dependency injection framework that does not have as large an impact on cold start times as Spring. However, because it also uses reflection for runtime dependency injection, it will still incur a larger cost than a compile-time framework.

For our purposes, because our application is relatively simple, we didn’t end up using any dependency injection at all. Still, it was a decision we had to weigh at the outset of development, and given how easily cold start times can get out of hand, we think it was an important one.
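For a small function, wiring dependencies by hand through constructors can look roughly like the sketch below. It is only an illustration: SlotRepository, InMemorySlotRepository, and SlotLookup are hypothetical names, not classes from our service. Nothing is discovered through reflection or classpath scanning at startup, yet a test can still pass in a fake implementation.

```java
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is not code from the delivery service.
class ManualWiringSketch {

  /** The dependency, expressed as an interface so that tests can substitute a fake. */
  interface SlotRepository {
    List<String> findSlots(String farmId);
  }

  /** A stand-in implementation; a real one would talk to a database. */
  static class InMemorySlotRepository implements SlotRepository {
    private final Map<String, List<String>> slots;

    InMemorySlotRepository(Map<String, List<String>> slots) {
      this.slots = slots;
    }

    @Override
    public List<String> findSlots(String farmId) {
      return slots.getOrDefault(farmId, List.of());
    }
  }

  /** Business logic receives its dependency through the constructor. */
  static class SlotLookup {
    private final SlotRepository repository;

    SlotLookup(SlotRepository repository) {
      this.repository = repository;
    }

    int countSlots(String farmId) {
      return repository.findSlots(farmId).size();
    }
  }

  public static void main(String[] args) {
    // The "injection" is a plain constructor call made once at startup.
    SlotRepository repository =
        new InMemorySlotRepository(Map.of("farm-1", List.of("09:00", "10:00")));
    SlotLookup lookup = new SlotLookup(repository);
    System.out.println(lookup.countSlots("farm-1")); // prints 2
  }
}
```

The trade-off is that the wiring code grows with the number of components, which is the point at which a compile-time framework such as Dagger or Micronaut starts to pay for itself.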

Also, a good point to keep in mind is that avoiding dependency injection where possible does not necessarily mean avoiding any annotations that reduce boilerplate code and make your life easier.

For instance, Lombok is still a great choice to avoid having to write extra Builder/Getter/Setter code since all of this code is also generated at compile-time. Beyond that, it can even be added to your build as a compile-time only implementation so that the library itself isn’t included in your deployment package. Wherever possible, you should try to separate your dependencies by whether they operate at compile-time in order to avoid any extra overhead with unpacking your functions in the cloud.

dependencies {
    .
    .
    .
    implementation 'io.github.json-snapshot:json-snapshot:1.0.17'
    implementation 'commons-codec:commons-codec:1.11'
    implementation 'software.amazon.awscdk:sam:1.102.0'
    testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.1'
    testImplementation 'org.junit.jupiter:junit-jupiter-engine:5.7.1'
    testImplementation 'org.assertj:assertj-core:3.18.1'
    
    // Dependencies that are only used during compilation
    compileOnly 'org.projectlombok:lombok:1.18.20'
    annotationProcessor 'org.projectlombok:lombok:1.18.20'
}

Pick libraries carefully

As mentioned previously, one of the largest impacts on function latency, particularly with regard to cold starts, is the number of classes that get loaded when the Lambda is created. Any upfront initialization work done by those libraries adds to your cold start times as well. One of the best ways to reduce the time needed for this initialization is to avoid using libraries with overly large dependency structures. To that end, one of the first adjustments you should make to your Java code is to use the AWS Java SDK v2 instead of Java SDK v1.

The AWS Java SDK v1 has some things that make it unwieldy when used in Lambda functions. In particular, it is tightly coupled with the Apache Http Client, and there is no option to override this client in the library. The Java SDK v2, on the other hand, lets you swap in a lighter HTTP client, such as the UrlConnectionHttpClient built on the JDK's HttpURLConnection, by passing it to the builder of each service client. Take the following for instance:

  /**
   * Retrieves secret value from the given Secrets Manager store name.
   *
   * @param region Region the secret is in
   * @param dbSecretStoreName Secrets Manager store name
   * @return JSON object inside the store
   */
  public static JSONObject getSecret(@NonNull String region, @NonNull String dbSecretStoreName) {
    SecretsManagerClient secretsClient = SecretsManagerClient.builder()
            .region(Region.of(region))
            .httpClientBuilder(UrlConnectionHttpClient.builder())
            .build();

You can see that here we override the httpClientBuilder setting on the SecretsManagerClient builder. If we had not done this, then the default Apache Http Client would have been used and a much larger number of dependencies would be imported. By overriding the Http Client in this way for all of your AWS service clients, you can significantly reduce the duration of your cold starts. In our experimentation, switching to the UrlConnectionHttpClient reduced our cold starts by ~500 milliseconds. For more info, see the post here that goes further into the features included in the AWS SDK v2.

Another big consideration when beginning to develop your application in Lambda is what logging framework you are going to use. In our project we use Log4j with the aws-lambda-java-log4j2 implementation, as it is meant to be used explicitly with Lambda functions. Using Log4j is helpful because it provides benefits such as customizable logging levels and formatted log output, the latter of which is exceptionally useful when searching through CloudWatch logs for errors. However, because of the large number of dependencies required, it has the potential to add ~650 milliseconds to cold start times based on our experimentation. If speed is of the essence at all times and you don’t need a robust logging solution, it may be worth omitting Log4j in order to cut down on the cold starts. But keep in mind that cold starts only affect a small number of users, and forgoing a logging framework will usually make the operations side of your service more difficult to manage.

Tricks To Reduce Invocation Latency

One of the easiest ways to optimize a Java Lambda is to reduce the amount of time spent initializing objects with every invocation of the function. Take the following example, which initializes a SlotService object that maintains all of the database connections within our application,

/**
 * A Lambda handler for GetSlot API Call.
 */
public class GetSlots implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {
  private SlotService slotService;
  private static final Logger logger = LogManager.getLogger(GetSlots.class);

  /**
   * Constructor called by AWS Lambda.
   */
  public GetSlots() {
    this.slotService = new SlotService();
  }

  // ... (rest of the handler not shown)
}

/**
 * Provides methods to interact with Slots in the data layer.
 */
@Data
public class SlotService {
  private static final String DB_ENDPOINT = System.getenv("DB_ENDPOINT");
  private static final String DB_REGION = System.getenv("DB_REGION");
  private static final String DB_USER = System.getenv("DB_USER");
  private static final long BACKOFF_TIME_MILLI = 1000; // One second

  private Connection con;
  private DbUtil dbUtil;
  private static final Logger logger = LogManager.getLogger(SlotService.class);

  /**
   * Constructor used in actual environment (inside Lambda handler).
   */
  public SlotService() {
    this.dbUtil = new DbUtil();
    this.con = dbUtil.createConnectionViaIamAuth(DB_USER, DB_ENDPOINT, DB_REGION);
    logger.info("SlotService Empty constructor, reading from env vars");
  }

  // ... (rest of the service not shown)
}

Note in particular that the SlotService object is initialized in the constructor of the Lambda handler class. This is important since initialization of the class only occurs once, when an instance of your Lambda function is first called. If the SlotService were to be created in the actual handler function itself then the database connections would have to be re-created every time the function is called. With this practice of only initializing things once we can greatly reduce the latency required for individual function invocations.
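The difference can be seen in a minimal, dependency-free sketch. The names here (ExpensiveResource, InitOnceHandler, InitEveryTimeHandler) are hypothetical stand-ins rather than classes from our service, and the counter exists only to make the number of setups visible.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names; ExpensiveResource stands in for something like a database-backed service.
class InitOnceSketch {

  /** Counts how many times the costly setup runs. */
  static final AtomicInteger SETUPS = new AtomicInteger();

  static class ExpensiveResource {
    ExpensiveResource() {
      SETUPS.incrementAndGet(); // imagine opening a database connection here
    }

    String query(String input) {
      return "result for " + input;
    }
  }

  /** Setup runs when the handler object is constructed, so once per handler instance. */
  static class InitOnceHandler {
    private final ExpensiveResource resource = new ExpensiveResource();

    String handleRequest(String input) {
      return resource.query(input);
    }
  }

  /** Anti-pattern: setup runs again on every invocation. */
  static class InitEveryTimeHandler {
    String handleRequest(String input) {
      return new ExpensiveResource().query(input);
    }
  }

  public static void main(String[] args) {
    InitOnceHandler once = new InitOnceHandler();
    for (int i = 0; i < 3; i++) {
      once.handleRequest("slot-" + i);
    }
    System.out.println("init-once setups: " + SETUPS.get()); // prints 1

    SETUPS.set(0);
    InitEveryTimeHandler everyTime = new InitEveryTimeHandler();
    for (int i = 0; i < 3; i++) {
      everyTime.handleRequest("slot-" + i);
    }
    System.out.println("init-every-time setups: " + SETUPS.get()); // prints 3
  }
}
```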

One important thing to keep in mind with this design pattern is that any potential issues with the objects initialized will cause subsequent invocations to fail. In the majority of cases this is not an issue. However, if like in this example, you create a connection to another service and that connection fails for whatever reason (database failover, etc.) then your Lambda will continue trying to use that dead connection and calls to the function will start failing. This is particularly problematic as there is no way to programmatically kill a Lambda function if a situation like this occurs. You must either wait for the Lambda to be deleted naturally over time, or manually change the settings of the Lambda function in order to refresh it.

To guard against this issue, it is a good practice to verify that any connections that could potentially fail are still live when attempting to use them. If they are not, the connection will have to be refreshed in order for the function to proceed. One example of this is when rotating passwords or secrets that are used in your application, such as what is described in this article, or if temporary IAM credentials used by your service expire. In our particular situation, we were also concerned with the possibility that our database could fail over to a backup instance and we would still be using a connection that was made to the primary instance. For more in-depth information on working specifically with RDS databases within Java Lambdas, check out our other posts on the topic:

Connecting your Java AWS Lambda to an RDS database and RDS Proxy

How to use Java in your DB-connected AWS Lambdas

  /**
   * Refreshes the database connection in case there is a warm Lambda that has a connection that has either closed or
   * failed to connect.
   *
   * @return the existing Connection or a new one in the case it needs to be refreshed
   */
  protected Connection refreshDbConnection() {
    Connection connection = this.con;
    try {
      if (connection == null || !connection.isValid(1)) {
        logger.info("Retrying database connection");
        try {
          Thread.sleep(BACKOFF_TIME_MILLI);
          connection = this.dbUtil.createConnectionViaIamAuth(DB_USER, DB_ENDPOINT, DB_REGION);
        } catch (InterruptedException e) {
          logger.error(e.getMessage(), e);
          throw new RuntimeException("There was a problem sleeping the thread while creating a connection to the DB");
        }
      }
    } catch (SQLException e) {
      logger.error(e.getMessage(), e);
      throw new RuntimeException("There was a problem refreshing the database connection due to an error while checking validity");
    }
    return connection;
  }
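The same check-then-refresh idea can be written generically. The sketch below is an assumption-laden illustration, not code from our service: RefreshingHolder and FakeConnection are hypothetical names, and the fake connection only simulates a failure so that the example runs without a database. The detail worth noting is that the holder stores the replacement, so later invocations on the same warm Lambda reuse it instead of reconnecting each time.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical names; a dependency-free illustration of validating a cached resource before use.
class RefreshingHolderSketch {

  /** Holds a long-lived resource and replaces it when a validity check fails. */
  static class RefreshingHolder<T> {
    private final Supplier<T> factory;
    private final Predicate<T> isValid;
    private T current;

    RefreshingHolder(Supplier<T> factory, Predicate<T> isValid) {
      this.factory = factory;
      this.isValid = isValid;
      this.current = factory.get(); // created once, during initialization
    }

    /** Returns the cached resource, recreating and caching it first if the check fails. */
    T get() {
      if (current == null || !isValid.test(current)) {
        current = factory.get();
      }
      return current;
    }
  }

  /** Fake connection used only for this demonstration. */
  static class FakeConnection {
    boolean open = true;
  }

  public static void main(String[] args) {
    RefreshingHolder<FakeConnection> holder =
        new RefreshingHolder<>(FakeConnection::new, c -> c.open);

    FakeConnection first = holder.get();
    System.out.println(first == holder.get()); // true: reused while it stays valid

    first.open = false; // simulate a failover that kills the connection
    FakeConnection second = holder.get();
    System.out.println(second != first && second.open); // true: replaced with a live one
  }
}
```

With a java.sql.Connection, the validity check could call isValid with a short timeout and treat an SQLException as "not valid"; the factory would be whatever method creates the connection.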