Solving cold starts using .NET Core — A Deeper Dive

Down the rabbit hole on what impacts cold start times

Diego Garber
Slalom Build
8 min read · Jul 16, 2021


In my previous article, I talked about how to improve response times on cold starts in AWS Lambdas.

You may have applied everything I said, and improved response times drastically. For example, maybe your original ten-second start time dropped to four seconds. It’s a great improvement, but why am I getting response times down to milliseconds when you are stuck at four full seconds? The answer is simple. In my article, I was working on a “Hello World” application, but chances are that you are working on enterprise software with multiple libraries used for mapping and data access.

Today, I’m going deeper down the rabbit hole, showing you different techniques and concepts that will help lower your response times even further. This article is cloud agnostic (although it always uses .NET), so it will help you improve times on AWS Lambdas, Azure Functions and even Kubernetes clusters.

I’m grouping the reasons that affect cold start times into two sections:

1) Code and related configuration

This includes actual code changes and compilation options like enabling Ready To Run.

2) Host configuration

Configurations that are specific to the hosting technology used. For example, in AWS we’d choose the runtime for our Lambdas, and for functions in Azure we’d choose whether they are hosted on Linux or Windows.

Code and related configuration

Ready To Run (R2R)

If you are using Serverless and not using Ready To Run (R2R) yet, please start using it right away. It’s an easy way to reduce your cold starts by 30-80%. If you don’t know how to apply it to your project, you can read the first part of this post here. You can also find example code here. I’ll assume you are using R2R for the rest of the guide.
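For reference, enabling R2R usually comes down to two properties in the project file. The fragment below is a minimal sketch, assuming a function published for 64-bit Linux; the RuntimeIdentifier value is an assumption and should match your actual host.

```xml
<!-- Goes inside the .csproj; the values are illustrative. -->
<PropertyGroup>
  <!-- R2R images are platform specific, so a runtime identifier is required. -->
  <RuntimeIdentifier>linux-x64</RuntimeIdentifier>
  <!-- Precompile the assemblies to native code at publish time. -->
  <PublishReadyToRun>true</PublishReadyToRun>
</PropertyGroup>
```

Keep in mind that R2R is applied by dotnet publish, not by a regular build.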

Which libraries we use

Not every library takes the same time to load. If you reference a library that cannot be compiled ahead of time (AOT), the time-saving benefits of R2R will not apply to it. Try to avoid referencing libraries that target .NET Core 2.1 and older (or .NET Standard 1.2 and older).

Even if the library runs on .NET Core 3 or newer, you have to take into account that not every library starts at the same speed. For example, AutoMapper and Newtonsoft are two widely used libraries that take considerable time to start and, therefore, you want to avoid them. But before you raise pitchforks, let me explain a few things about these libraries.

About AutoMapper

Even though I personally enjoy working with AutoMapper, it is tuned for performing many mappings over a long period of time. In other words, it is not tuned to make the first mapping fast.

AutoMapper needs to load all of our mappings into memory, compile them and only then can it be used.

The normal use case in Serverless is to have one mapping from our input, then process the request, do a query on the database (be it SQL, EFCore or NoSQL) and then map it back to the output object.

How many different mappings do you have in your application? 100? 500? 1000? AutoMapper wants to understand every single mapping before it can even apply the first one. This is not a problem for traditional hosting but for a Serverless application, it is not good.

Imagine you have 500 mappings that AutoMapper needs to load, including reading all the properties of the classes that you’re mapping. But your initial call is very simple, mapping only 4–5 objects. Do we need to load all those 500 mappings to only map 4–5 objects? The answer is probably no. This is one of the reasons why AutoMapper is — in my opinion — not a great fit for Serverless applications.

Fortunately, there are many alternatives to AutoMapper. Two I would recommend are writing the mappings manually or using Mapster. To illustrate this, the following graph shows the time a console application takes to start on a Lambda, load 100 mappings and map 5,000 objects. This benchmark was run on my personal laptop in a Linux container.

Figure: Example start times for manual mappings, AutoMapper and Mapster

I uploaded the code I used and instructions on how to try this yourself in my GitHub.
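To make the "manual mappings" option concrete, here is a minimal sketch of what one can look like. The OrderEntity, OrderResponse and OrderMapper names are hypothetical and exist only for this illustration; they are not taken from the benchmark code.

```csharp
using System;

// Hypothetical types, used only for illustration.
public class OrderEntity
{
    public int Id { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }
    public DateTime CreatedUtc { get; set; }
}

public class OrderResponse
{
    public int Id { get; set; }
    public string Customer { get; set; }
    public decimal Total { get; set; }
}

public static class OrderMapper
{
    // Plain property copies: nothing to scan, configure or compile at startup.
    public static OrderResponse ToResponse(OrderEntity entity)
    {
        if (entity == null) throw new ArgumentNullException(nameof(entity));

        return new OrderResponse
        {
            Id = entity.Id,
            Customer = entity.CustomerName,
            Total = entity.Total
        };
    }
}
```

A call such as OrderMapper.ToResponse(order) only touches the code for that one mapping, so the cost of the first call does not grow with the number of mappings in the application. The trade-off is more code to write and maintain by hand.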

Newtonsoft

Newtonsoft.Json is a fantastic library, but its slower performance is the reason Microsoft created their own implementation. If you replace Newtonsoft.Json with System.Text.Json, you will also see a reduction in start times!

For example, AWS recommends using System.Text.Json instead of Newtonsoft for its performance. Read more about it here.
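As a rough sketch of what the swap looks like in code, the snippet below serializes and deserializes with System.Text.Json and notes the Newtonsoft call each line would replace. The Greeting and JsonExample types are hypothetical and used only for illustration.

```csharp
using System.Text.Json;

// Hypothetical payload type, used only for illustration.
public class Greeting
{
    public string Message { get; set; }
    public int Count { get; set; }
}

public static class JsonExample
{
    // Newtonsoft equivalent: JsonConvert.SerializeObject(value)
    public static string ToJson(Greeting value) => JsonSerializer.Serialize(value);

    // Newtonsoft equivalent: JsonConvert.DeserializeObject<Greeting>(json)
    public static Greeting FromJson(string json) => JsonSerializer.Deserialize<Greeting>(json);
}
```

The two libraries do not share the same defaults (for example, System.Text.Json matches property names case-sensitively unless configured otherwise), so test your payloads after making the switch.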

Initial calls

.NET assemblies contain intermediate language (CIL) that needs to be compiled to machine code before execution. Traditionally, this compilation is done at run time by the Just In Time compiler (JIT).

With .NET Core 3.0, Microsoft enabled Tiered Compilation (TC) by default and introduced Ready To Run images, a form of Ahead of Time compilation (AOT). Tiered Compilation splits JIT compilation into two tiers. The first one, called Quick JIT, produces lower-quality code but compiles quickly. The second tier produces optimized machine code, with the trade-off that it takes longer to compile.
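The behavior described above can be controlled through two project-file properties, which is handy when you want to measure its effect on a cold start. This is a sketch for experimenting, not a recommendation: the values shown are the defaults, and whether changing them helps depends on your application.

```xml
<!-- Goes inside the .csproj; both values shown are the defaults. -->
<PropertyGroup>
  <!-- Set to false to turn Tiered Compilation off entirely. -->
  <TieredCompilation>true</TieredCompilation>
  <!-- Set to false to skip Quick JIT for methods without precompiled code. -->
  <TieredCompilationQuickJit>true</TieredCompilationQuickJit>
</PropertyGroup>
```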

To put everything together, we use R2R to compile our code ahead of time. This lets the runtime skip Quick JIT (the first tier of Tiered Compilation) for the precompiled code. That explained, please take a look at the following graph:

When we use a library for the first time in our code (aka “the initial call”), the compiler will generate the optimized JIT code. This compilation will be executed in parallel to the execution of our actual code.

If the initial call is smaller (for example, serializing a small object with only one property instead of a big object with hundreds of nested properties), then the one call that runs on non-optimized code finishes sooner.

The amount of time saved with this approach depends on the library. For example, if you have a big model (more than 20 classes) and you’re using EF Core 3.0, you can shave off several seconds in your first access.

Take a look at the following examples on how to do this for Newtonsoft and Entity Framework:

Example with Newtonsoft

Here is an example of how to initialize Newtonsoft:

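// Warm up Newtonsoft (using Newtonsoft.Json) with a tiny payload so the first real call is faster.
var initializationString = JsonConvert.SerializeObject(new { property = 1 });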

And you can compare the results using this code, which produces these results on my laptop using Ready To Run.
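If you want to run this kind of comparison yourself, one simple way is to time the first and the second call with a Stopwatch. The helper below is a minimal sketch; the StartupTimer name is hypothetical and it is not the code linked above.

```csharp
using System;
using System.Diagnostics;

public static class StartupTimer
{
    // Runs the action once and returns the elapsed wall-clock time in milliseconds.
    public static double MeasureMs(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }
}
```

Wrap the warm-up call, for instance StartupTimer.MeasureMs(() => JsonConvert.SerializeObject(new { property = 1 })), and then time the same call a second time. The gap between the two numbers is roughly the one-time cost that the initialization absorbs.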

Example with Entity Framework Core (EF Core)

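Using the same idea as before, here is an example of how to initialize EF Core using Pomelo, the MySQL provider. We just open and close the connection without actually loading any data.

// Build the options for the Pomelo MySQL provider.
var mySqlServiceRequestOptions = new DbContextOptionsBuilder<MyContextClass>();
mySqlServiceRequestOptions.UseMySql(myConnectionString);
// Pass the options to the context (assumes MyContextClass has a constructor that accepts them).
using var ctx = new MyContextClass(mySqlServiceRequestOptions.Options);
// Opening and closing the connection warms up EF Core without loading any data.
ctx.Database.OpenConnection();
ctx.Database.CloseConnection();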

Host configuration

The primary focus here is on code and code configuration, but I want to take some time to explain how different settings or hosting options can affect cold starts as well.

Operating system in Azure

When creating and configuring an Azure Function, you can choose if you want to use Windows or Linux as your operating system.

My experience is that start times are 100 to 200 ms faster on Windows than on Linux. I suggest you try this on your own and see what gives you better results.

Memory configured in AWS

AWS Lambda allocates CPU power in proportion to the amount of memory configured. Because cold start times depend directly on CPU processing power, configuring more memory shortens them. But there is a downside: more memory means a higher price for the time the function runs.

Specific runtime selected

You have multiple options to choose from for .NET runtimes:

  • .NET Core 2.1: I strongly suggest you do not pick this option. .NET Core 2.1 does not support Ready To Run (R2R), and R2R is our best tool to improve cold start performance.
  • .NET Core 3.1: Now that AWS supports .NET Core 3.1 natively, this is the go-to option for both Azure and AWS in most cases. A big caveat is that the AWS runtime already uses the System.Text.Json library, so adding Newtonsoft on top of it may add unnecessary startup time.
  • Create your own container or framework: This is the most complicated option, but it can definitely bring great results. I explain how to build your own framework for AWS Lambdas in this article.

Dedicated and pre-warmed instances

Having dedicated instances reduces the number of cold starts, not the time they take. This lowers the average time that a user needs to wait, but when a cold start happens, it takes exactly as long as before. The idea is to keep some instances always on, so that requests served by them do not incur a cold start.

You have to take into account that this will not prevent every cold start. For example, if you have a provisioned concurrency of one for a particular AWS Lambda, the first user will not face a cold start, but a second concurrent user will.

Also, this solution is more costly as we are paying for the time that our resources are not in use.

Conclusion

When you’re working with Serverless Computing, it’s paramount to address cold start times to get a responsive application. If you approach Serverless Computing to create an API in the same way you approach a traditional deployment (e.g., a Kubernetes cluster), you’ll probably have inconsistent response times that can annoy users.

The first and obvious solution is to have instances that are always online, but doesn’t this defeat one of the purposes of Serverless Computing, the fact that you only pay for what you actually use? I personally believe this is an intermediate solution (a workaround, if you will) that can be used while you’re working on actually improving response times, and only until you fix them.

When you’re optimizing your application: measure, measure, measure! You’ll need to know which libraries take the most time to start. Some of the libraries that add a lot of time (like AutoMapper) can be swapped for libraries that load a lot faster. Lastly, don’t forget about initializing libraries with very small payloads. It may sound counter-intuitive, but it can save A LOT of time.

You can probably grab your current application and reduce your startup times by 70%. Who said you can’t have your cake and eat it too?

Happy coding :)

Glossary

Tiered compilation

https://docs.microsoft.com/en-us/dotnet/core/whats-new/dotnet-core-3-0#tiered-compilation

Run-time compilation settings

https://docs.microsoft.com/en-us/dotnet/core/run-time-config/compilation

Explanation of Ready To Run

https://docs.microsoft.com/en-us/dotnet/core/whats-new/dotnet-core-3-0#readytorun-images

Diego Garber
Slalom Build

Solutions Architect for Slalom Build Chicago currently focused on cloud technologies and .Net Core