Published in Neiman Marcus Tech

Serverless Provisioned Concurrency Autoscaling

Neiman Marcus has open sourced its first Serverless plugin: Provisioned Concurrency Autoscaling. In this post, I will detail some of the components of the plugin and explain why it was created.

Please check out the code on GitHub or npm.


Function cold starts have always been one of the biggest hindrances to using AWS Lambda. Lambda runs on demand, decoupling code from infrastructure. That is the nature of the service: code runs only when needed. Advantages of this approach include lower cost, no infrastructure to manage, decoupled applications, and the ability to scale individual services out or in as needed.

However, slow cold starts can be a critical issue for applications that need function code to start executing with minimal delay.

Previously, developers had to implement additional software to keep Lambda execution environments warm and ready to execute. This was less than ideal, and the resulting performance was unpredictable.

In December 2019, however, AWS announced provisioned concurrency. This change has been a huge win for developers who want to improve their execution times: it keeps instances of the execution environment initialized and waiting to do their work. Even better, the Serverless Framework supported it on day one.

Scaling Provisioned Concurrency

This update also introduced Application Auto Scaling support for provisioned concurrency, in the form of a scaling policy and a scalable target. With these resources, developers can scale provisioned concurrency on a function based on a schedule or by tracking a utilization metric. This is very useful, since utilization fluctuates throughout the day.

Serverless Support

Serverless is a big win for developers, as it aims to simplify configuration and deployment. Configuration often takes only a few lines of YAML. What we don’t see behind the scenes, and thankfully so, is the great amount of work the tooling performs to create CloudFormation resources.

Provisioned concurrency is no different. A simple configuration for provisioned concurrency turns from this…

… to this:

What a tremendous simplification!
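To make the contrast concrete, here is a minimal sketch of the Serverless side. The service name, runtime, handler, and the value 5 are hypothetical and chosen only for illustration; `provisionedConcurrency` is the function-level setting the framework provides.

```yaml
# serverless.yml (service, runtime, and handler names are hypothetical)
service: my-service

provider:
  name: aws
  runtime: nodejs18.x

functions:
  hello:
    handler: handler.hello
    # Keep five execution environments initialized and ready
    provisionedConcurrency: 5
```

At package time, as far as I understand it, the framework expands that one line into a published function version plus an alias that carries the provisioned concurrency setting, so none of that CloudFormation has to be written by hand.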

Configuration Issues

Issues arise when creating the autoscaling policy and target. The advantage of using Serverless is that it simplifies the YAML configuration, preventing YAML hell. Unfortunately, Serverless does not natively support Application Auto Scaling resources; they have to be written as raw CloudFormation. Now let’s look at adding these CloudFormation resources to the project:

This is a lot of opinionated configuration, which is difficult to work with. Even worse, there is no way to know some of the configuration items until the package is built for deployment. The developer would have to package the project locally and peek at the generated CloudFormation.

Finally, for each function with provisioned concurrency, these 20+ lines of YAML would have to be copied and pasted over and over.
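A rough sketch of what those hand-written resources might look like is below. Every name and value is illustrative: the logical IDs, the `my-service-dev-hello` function name, the capacities, and the cooldowns are assumptions, as is the `HelloProvConcLambdaAlias` logical ID for the alias Serverless generates. The sketch also leaves out `RoleARN`, on the assumption that Application Auto Scaling's service-linked role is used.

```yaml
# serverless.yml (continued); all names and values below are illustrative
resources:
  Resources:
    HelloScalableTarget:
      Type: AWS::ApplicationAutoScaling::ScalableTarget
      # Logical ID of the generated alias; assumed here, and in practice
      # only discoverable by inspecting the packaged template
      DependsOn: HelloProvConcLambdaAlias
      Properties:
        ServiceNamespace: lambda
        ScalableDimension: lambda:function:ProvisionedConcurrency
        # Format is function:<function name>:<alias>
        ResourceId: function:my-service-dev-hello:provisioned
        MinCapacity: 1
        MaxCapacity: 10
    HelloScalingPolicy:
      Type: AWS::ApplicationAutoScaling::ScalingPolicy
      Properties:
        PolicyName: hello-provisioned-concurrency-tracking
        PolicyType: TargetTrackingScaling
        ScalingTargetId:
          Ref: HelloScalableTarget
        TargetTrackingScalingPolicyConfiguration:
          # Aim to keep about 75% of the provisioned environments in use
          TargetValue: 0.75
          ScaleInCooldown: 120
          ScaleOutCooldown: 0
          PredefinedMetricSpecification:
            PredefinedMetricType: LambdaProvisionedConcurrencyUtilization
```

Note that both the `ResourceId` and the `DependsOn` value depend on names the framework derives from the service, stage, and function.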

Enter the Plugin

Fortunately, Serverless is extensible and has a robust plugin ecosystem to help with one-off features and functionality. This is genius, as it allows the community to fill gaps where the core functionality is lacking. That is exactly what the new Provisioned Concurrency Autoscaling plugin does!

The plugin generates these CloudFormation resources from a simplified configuration, in keeping with the spirit of the tooling. So, we go from the example above to…
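As a hedged sketch, the simplest form looks roughly like the following. The plugin name matches the npm package; the `concurrencyAutoscaling` key is written as I recall it from the plugin's README, so please verify it against the repository. The function and handler names are hypothetical.

```yaml
# serverless.yml (function and handler names are hypothetical)
plugins:
  - serverless-provisioned-concurrency-autoscaling

functions:
  hello:
    handler: handler.hello
    provisionedConcurrency: 1
    # Turn on autoscaling with the plugin's default settings
    concurrencyAutoscaling: true
```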


Much simpler. But what about specific configuration? There is a way for that also…
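The sketch below shows one function with partial configuration and one with full configuration. The option names and the values shown are my recollection of the plugin's README rather than an authoritative reference, and the function names are hypothetical, so check the repository before relying on them.

```yaml
# serverless.yml (option names assumed from the plugin README)
functions:
  hello:
    handler: handler.hello
    provisionedConcurrency: 1
    # Partial configuration: anything omitted falls back to a default
    concurrencyAutoscaling:
      enabled: true
      maximum: 20

  world:
    handler: handler.world
    provisionedConcurrency: 1
    # Full configuration
    concurrencyAutoscaling:
      enabled: true
      alias: provisioned    # alias that carries the provisioned concurrency
      maximum: 10           # upper bound on provisioned environments
      minimum: 1            # lower bound on provisioned environments
      usage: 0.75           # target utilization to track
      scaleInCooldown: 120  # seconds to wait before scaling in again
      scaleOutCooldown: 0   # seconds to wait before scaling out again
```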

Notice the example has a function with partial configuration and one with full configuration. If configuration is omitted, defaults are substituted.


In this post, I briefly discussed the new Serverless plugin for provisioned concurrency autoscaling and how it improves the lives of developers who want to reduce both the execution times of their applications and the amount of CloudFormation they have to maintain. Please continue to our GitHub page for more, and keep an eye out for other open source contributions from Neiman Marcus in the future.

Thank you!