Azure Functions scaling and cold start time
Cold start time is always a big topic for serverless applications, but what is it exactly? To understand this, we first have to look at how a serverless function app is executed in Azure in the first place.
Deployment Unit
A function app is the unit of deployment and management. It can contain one function or a set of functions in the same language. They are managed, deployed, and also scaled together. That means all the functions in one app can only be scaled as one unit. In the Consumption plan, they can also be scaled to zero. Scaling works a little differently between the payment plans; in this article we only take the Consumption (and Premium) plans into consideration.
Scaling
The number of running instances is controlled by a scale controller in Azure. It monitors the rate of incoming events and uses heuristics, specific to each trigger type, to decide whether to scale instances out or in.
When no functions are running, the number of instances is scaled in to zero.
Cold Start Time
When a new function app is started, it always starts cold.
The cold start time is the time a function app (or any other application) that is not already running needs before it can accept events. This is especially relevant when the app has been scaled to zero.
So more precisely, a cold start is an increase in latency for Functions which haven’t been called recently. [1]
For Azure Functions this is mainly relevant for apps running on the Consumption plan (which can scale to zero). Apps on a Dedicated plan always have at least one running instance.
Cold function host instance is started
If a function app is not running, it has to be started from scratch. This is also called a cold start.
Function apps run on worker environments in Azure. Azure always keeps a pool of free, warm, unspecialized workers with the Functions runtime on standby. These are used to run cold function apps.
When the app is not running, Azure first allocates one of these workers to the app. The worker then becomes specialized by mounting the app's source files and applying the app-specific settings and host configuration (defined in host.json).
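As an illustration, a minimal host.json could look like the following (the values shown are just an example, not a recommendation):

```json
{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true
      }
    }
  }
}
```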
In the next step, the function.json file of each function is read to determine and load the required extensions (e.g. for the triggers and bindings). Then the functions are loaded into memory. The time this takes depends on the size of your app. Afterwards, the app is ready to handle incoming events.
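For example, a function.json for a queue-triggered function might look like this (the queue name is a placeholder); it tells the host to load the Storage queue extension:

```json
{
  "bindings": [
    {
      "name": "message",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "orders",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
```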
A warm app can handle events directly. The warm app is only reused if the next call happens within minutes after the last one; once an instance has been idle for a couple of minutes, it is scaled in again and the next call starts cold.
Improve Cold Start Time
To improve cold start time you should try to keep the size of your resources small. One way to achieve this is by only adding the absolutely required dependencies. Another way is to split the app into different function apps that are scaled independently. There are no hard rules for when it is better to split an app; this depends a lot on the actions inside the functions and the expected load.
The download of the app's source files to the worker can be sped up by using the feature to run them directly from a zip package. When the app runs directly from the zip package, the code is mounted into the root folder instead of being copied. This is mostly relevant for applications with large dependency trees, such as function apps written in JavaScript.
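Run-from-package is enabled through an app setting. With the Azure CLI this could look as follows (app name, resource group, and zip file name are placeholders):

```shell
# Tell the app to run directly from the deployed zip package
# (the package is mounted read-only instead of being extracted).
az functionapp config appsettings set \
  --name <app-name> \
  --resource-group <resource-group> \
  --settings WEBSITE_RUN_FROM_PACKAGE=1

# Deploy the zipped function app
az functionapp deployment source config-zip \
  --name <app-name> \
  --resource-group <resource-group> \
  --src my-app.zip
```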
Sources
- https://azure.microsoft.com/en-us/blog/understanding-serverless-cold-start/
- https://docs.microsoft.com/en-us/azure/azure-functions/event-driven-scaling
- https://docs.microsoft.com/en-us/azure/azure-functions/functions-scale
- https://docs.microsoft.com/en-us/azure/azure-functions/functions-versions?tabs=in-process%2Cv4&pivots=programming-language-java
- https://docs.microsoft.com/en-us/azure/azure-functions/performance-reliability#scalability-best-practices
- https://docs.microsoft.com/en-us/azure/app-service/deploy-run-package