AWS Lambda Performance Best Practices
A checklist for Cloud Engineers to live by
In this blog, let’s discuss best practices to optimize AWS Lambda performance and keep your Lambda deployments running at peak efficiency. Whether you’re just getting started with AWS or are already experienced, these insights will help you maximize your application’s performance while leveraging AWS’s cutting-edge capabilities.
✅ Choose an efficient runtime
When comparing cold start times, Python runtimes are the fastest, and Node.js runtimes can be nearly as fast, whereas Java can be 3 times slower than Python. The usual cure for Java’s slower cold starts is to allocate more memory, which can roughly double the cost compared to Python or Node.js. Therefore, even though Lambda supports multiple runtimes, pick an efficient runtime like Python or Node.js unless you have a really good reason not to. As a rule of thumb, interpreted languages (e.g. Node.js, Python) usually perform better for Lambda cold starts, even though in some cases (e.g. for warm, subsequent requests), compiled languages (e.g. Java) can perform better.
✅ Mitigate cold starts
Implement strategies like provisioned concurrency, reserved concurrency, Lambda warmers, proactive initialization, and Lambda SnapStart to minimize the impact of cold starts and reduce latency.
- Provisioned concurrency: This allocates pre-initialized execution environments to your Function, ready for immediate response to incoming requests. However, be aware that configuring this feature incurs additional charges to your AWS account.
- Reserved concurrency: This sets the maximum concurrent instances for your Function. Once configured, no other Function can use that concurrency. Importantly, no additional charges are associated with configuring reserved concurrency for a Function.
- Lambda warmers: This is currently best achieved through scheduled “ping” invocations via Amazon EventBridge (formerly CloudWatch Events) and involves a few strategic practices to pre-warm Lambda Functions effectively. The key tips: don’t ping more often than every 5 minutes, invoke the Lambda Function directly rather than going through API Gateway, use a known dummy payload, and craft the handler logic to recognize that payload and respond without executing the entire Function. To pre-warm multiple concurrent instances, invoke the same Function multiple times with delayed executions, preventing the system from reusing the same container and ensuring warm environments for concurrent Function instances.
- Proactive initialization: For functions using unreserved (on-demand) concurrency, Lambda occasionally pre-initializes execution environments to reduce the number of cold start invocations. This does NOT mean you’ll never see a cold start again, and the proportion of true cold starts to proactive initializations varies with many factors, but the effect is observable in practice.
- Lambda SnapStart (Java only): With SnapStart, Lambda initializes your function when you publish a function version. Lambda takes a Firecracker microVM snapshot of the memory and disk state of the initialized execution environment, encrypts the snapshot, and caches it for low-latency access. When you invoke the function version for the first time, and as the invocations scale up, Lambda resumes new execution environments from the cached snapshot instead of initializing them from scratch, improving startup latency. Note that SnapStart supports Java 11 and later Java-managed runtimes.
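To make the Lambda warmer tip above concrete, here is a minimal Python handler sketch that short-circuits on a dummy payload. The "warmer" key and the response shape are assumed conventions for this sketch, not an AWS standard:

```python
import time

def handler(event, context):
    # Short-circuit warm-up pings: respond immediately without running
    # the real business logic. The "warmer" key is an assumed convention;
    # use whatever dummy payload your scheduler actually sends.
    if event.get("warmer"):
        # Optional brief pause so several concurrent warm-up invocations
        # land on separate containers instead of reusing one that just freed up.
        time.sleep(0.1)
        return {"warmed": True}

    # ... real business logic runs only for genuine requests ...
    return {"statusCode": 200, "body": "processed"}
```

The scheduler then invokes the function directly with the dummy payload, keeping containers warm at minimal cost.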
✅ Configure optimal memory
Increasing memory can shorten cold start and execution times, because memory size also determines the amount of virtual CPU and other resources available to a function. At the same time, overprovisioning or underprovisioning memory can adversely affect both the performance and the cost of a Lambda Function. Analyse the actual memory requirements of your application code and assign only what’s required, plus a slight buffer (to visualize and fine-tune the memory/power configuration of Lambda functions, you can try a tool like aws-lambda-power-tuning).
✅ Optimize Function package sizing
To enhance deployment speed and expedite function initialization, it’s essential to trim the size of your deployment packages. Removing redundant and unused code sections, along with unnecessary dependencies (e.g. remove dev/build/compile-time dependencies and keep only runtime dependencies in the final bundle), not only accelerates deployment but also contributes to faster function startup. Smaller package sizes reduce the download and unpacking time before invocation, particularly crucial for minimizing cold start delays (however, note that in some performance tests, it has been observed that when memory and other resources are configured to higher values, increasing the package size has less impact on overall performance).
For Node.js functions, consider minifying or uglifying your JavaScript code to shrink the package size (using a module bundler library like webpack). This practice significantly decreases the time needed to download the package. In certain instances, the package size could drop by 80–90%, offering substantial improvements in download efficiency.
For functions authored in Java or .NET Core, AWS’s recommendation is to refrain from including the entire AWS SDK library in your deployment package. Instead, selectively depend on modules that encompass the SDK components you need, such as DynamoDB or Amazon S3 SDK modules and Lambda core libraries. Also, improve Lambda’s unpacking speed for Java-authored packages by placing dependency .jar files in a separate /lib directory. This approach is more efficient than consolidating all your function’s code into a single jar with numerous .class files. Streamline dependencies by opting for simpler frameworks that load swiftly during execution environment startup. For instance, prioritize straightforward Java dependency injection (IoC) frameworks like Dagger or Guice over more complex options like the Spring Framework (if you prefer the Spring ecosystem, go for Spring Cloud Function rather than the Spring Boot web framework).
✅ Implement asynchronous invocations
Utilize asynchronous invocation for scalable processing, enabling the system to efficiently handle varying workloads without waiting for the completion of each function.
# To invoke a function asynchronously,
# set the invocation type parameter to Event.
aws lambda invoke \
--function-name my-function \
--invocation-type Event \
--cli-binary-format raw-in-base64-out \
--payload '{ "key": "value" }' response.json
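One consequence of asynchronous invocation is that Lambda retries failed async events, so the same event can be delivered more than once; handlers should therefore be idempotent. A minimal sketch, where the in-memory _seen set is a stand-in for a durable store such as a DynamoDB table:

```python
# Async invocations can be retried, so the handler must be idempotent:
# processing the same event twice must not duplicate side effects.
# _seen stands in for a durable store (e.g. a DynamoDB table) here.
_seen = set()

def handler(event, context):
    event_id = event["id"]
    if event_id in _seen:
        # Already handled: acknowledge without repeating side effects.
        return {"status": "duplicate", "id": event_id}
    _seen.add(event_id)
    # ... side-effecting work happens exactly once per id ...
    return {"status": "processed", "id": event_id}
```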
✅ Configure concurrent executions
Configure the concurrency settings of your functions to control the number of simultaneous executions, especially for critical workflows. This helps prevent throttling due to resource exhaustion, especially in scenarios where multiple invocations can potentially overload the system.
✅ Leverage batch processing
For scenarios involving multiple items, instead of triggering Lambda functions individually for each item, aggregate them into batches (based on counts or task types) first and then pass to the function for processing. This significantly reduces the overhead associated with individual function invocations. This approach is particularly beneficial when dealing with large datasets or repetitive tasks.
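The batching idea can be sketched as a helper that groups items before handing them to a single processing call; batched, process_batch, and the batch size are illustrative names, not an AWS API:

```python
from typing import Iterable, Iterator, List

def batched(items: Iterable, batch_size: int) -> Iterator[List]:
    """Yield successive fixed-size batches from an iterable."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

def process_batch(batch):
    # One Lambda invocation handles a whole batch instead of one item.
    return [item * 2 for item in batch]

# 7 items become 3 invocations instead of 7: [0,1,2], [3,4,5], [6]
results = [process_batch(b) for b in batched(range(7), 3)]
```

Event sources like SQS and Kinesis offer built-in batch-size settings that achieve the same effect without custom code.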
✅ Leverage Lambda layers
Leveraging Lambda Layers for shared code (such as common logic, libraries, and dependencies) promotes efficient code reuse across multiple functions. This helps reduce the size of individual function packages, speeds up cold starts, and streamlines maintenance (fewer dependencies, clear versioning and updates).
✅ Implement caching strategies
Implement caching mechanisms for frequently accessed data to reduce the need for repetitive computations and costly database queries. This can significantly enhance function performance, especially in scenarios where certain data remains relatively unchanged (e.g. system configs, reference data, enum values).
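A minimal sketch of an in-memory cache that persists across invocations within a warm execution environment; the cache layout, TTL value, and loader callback are assumptions for illustration:

```python
import time

# Module-level state survives across invocations in the same warm
# execution environment, so cached values are reused between requests.
_CACHE = {}
_TTL_SECONDS = 300  # illustrative expiry for slow-changing reference data

def get_cached(key, loader):
    """Return a cached value, calling loader() only on a miss or expiry."""
    entry = _CACHE.get(key)
    now = time.monotonic()
    if entry is not None and now - entry[1] < _TTL_SECONDS:
        return entry[0]
    value = loader()  # e.g. a database or Parameter Store read
    _CACHE[key] = (value, now)
    return value
```

Note that each execution environment has its own cache; for a shared cache across instances, an external store such as ElastiCache would be needed.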
✅ Choose an established runtime version
AWS already has its own internal caching mechanisms built into the Lambda platform; from time to time, these caching configurations get adjusted based on varying workloads and usage. In particular, newer runtimes may take some time to reach peak caching performance. Therefore, for critical workloads, it is recommended to pick a stable, mature runtime over a recently announced shiny new one.
✅ Leverage stateless and ephemeral design
Designing functions to be stateless (they do not maintain a central state and hence won’t share data between requests/sessions) and ephemeral (short-lived, within Lambda’s 15-minute maximum execution time) simplifies their execution and management. Stateless functions are easier to replicate and scale horizontally, allowing for better performance in scenarios with fluctuating workloads. The absence of a persistent state makes it easy to distribute tasks across multiple instances for streamlined processing without worrying about interdependencies. Also, the ephemeral nature ensures the robustness and reliability of the system by preventing the accumulation of issues over prolonged execution periods.
✅ Share global/static variables, singleton objects, HTTP and DB connections between subsequent invocations
Where it is necessary, make use of global/static variables and singleton objects, since it is more efficient to keep them alive and share them between subsequent invocations until the instance goes down than to re-initialise them for every request. Set up HTTP and database connections at the global level to reuse them in future invocations (for instance, in the Node.js + MongoDB/DocumentDB case, creating the MongoClient object at the global level and setting the callbackWaitsForEmptyEventLoop property on the AWS Lambda Context object to false allows a Lambda function to reuse the DB connection across invocations).
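The same pattern translates to other runtimes. Here is a Python sketch of lazy, module-level client reuse, with FakeDbClient standing in for a real driver (e.g. pymongo.MongoClient) so the example is self-contained:

```python
# Objects created outside the handler persist for the lifetime of the
# execution environment, so the connection is built once per container,
# not once per request. FakeDbClient is a stand-in for a real driver.
class FakeDbClient:
    instances = 0  # counts how many connections were actually created

    def __init__(self):
        FakeDbClient.instances += 1

    def query(self, q):
        return f"result for {q}"

_client = None

def get_client():
    global _client
    if _client is None:  # lazy init: pay the cost on the first (cold) call
        _client = FakeDbClient()
    return _client

def handler(event, context):
    return get_client().query(event["q"])
```

Every warm invocation reuses the single client instead of reconnecting.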
✅ Set up auto-scaling
Configuring auto-scaling settings allows your functions to dynamically adjust the number of concurrent executions based on demand. This flexibility ensures optimal resource utilization and responsiveness to varying workloads.
✅ Implement connection pooling
Implementing connection pooling helps in reusing database connections across multiple function invocations, reducing the overhead of establishing new connections each time. This is particularly important for functions with frequent interactions with databases.
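A minimal pool sketch, assuming a fixed pool size and a fake driver so the example runs anywhere; on AWS, a managed option like Amazon RDS Proxy is often used to pool connections across many Lambda instances:

```python
import queue

class FakeConnection:
    created = 0  # counts connections actually opened

    def __init__(self):
        FakeConnection.created += 1

    def execute(self, sql):
        return f"ran: {sql}"

class ConnectionPool:
    """Minimal fixed-size pool: connections are created up front and
    checked out/in instead of being opened for every invocation."""

    def __init__(self, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(FakeConnection())

    def acquire(self):
        return self._pool.get()

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(size=2)  # pool size is an illustrative choice

def run_query(sql):
    conn = pool.acquire()
    try:
        return conn.execute(sql)
    finally:
        pool.release(conn)  # always return the connection to the pool
```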
✅ Configure timeouts
Setting appropriate timeouts ensures that your functions don’t consume unnecessary resources. It’s essential to strike a balance between allowing sufficient time for execution and avoiding prolonged resource occupation in case of failures. Timeouts force developers to implement other performance best practices like:
- Decouple long-running tasks into async processes,
- Minimize and optimize costly I/O operations,
- Minimize external dependencies and waiting times,
- Implement caching mechanisms, and
- Leverage concurrency.
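One way to respect the configured timeout is to check the remaining time exposed by the Lambda context object and stop early. FakeContext below imitates the real context’s get_remaining_time_in_millis() so the sketch is runnable, and the safety margin is an assumption:

```python
import time

class FakeContext:
    """Stand-in for the real Lambda context object, which exposes
    get_remaining_time_in_millis()."""

    def __init__(self, timeout_ms):
        self._deadline = time.monotonic() * 1000 + timeout_ms

    def get_remaining_time_in_millis(self):
        return max(0, self._deadline - time.monotonic() * 1000)

SAFETY_MARGIN_MS = 500  # illustrative: stop early, leaving room to wrap up

def handler(event, context):
    processed = []
    for item in event["items"]:
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            # Out of time: return progress so the rest can be re-queued
            # instead of the invocation being killed mid-flight.
            return {"processed": processed,
                    "remaining": event["items"][len(processed):]}
        processed.append(item * 2)
    return {"processed": processed, "remaining": []}
```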
✅ Optimize logging
Streamlining logging practices by focusing on essential information minimizes the volume of data transferred and stored. This reduces both the associated costs and the impact on function performance.
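A small sketch of level-based logging where verbosity is controlled via an environment variable (LOG_LEVEL is an assumed variable name), so noisy DEBUG detail is skipped in production without a redeploy:

```python
import logging
import os

# Read the level from an environment variable so verbosity can be tuned
# per environment; default to INFO when the variable is unset.
logger = logging.getLogger()
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO"))

def handler(event, context):
    # Lazy %-formatting: the string is only built if the level is enabled.
    logger.info("processing order %s", event.get("order_id"))
    # Expensive diagnostic detail only materializes at DEBUG level.
    logger.debug("full event: %s", event)
    return {"ok": True}
```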
✅ Optimize invocation payload
Minimizing the size of invocation payloads is essential for reducing network latency through faster data transfers and enhancing the overall efficiency of your functions through faster processing. This is particularly important when dealing with frequent function invocations or scenarios where data transfer and processing times contribute significantly to overall execution time.
✅ Optimize networking
Minimizing external dependencies and optimizing networking configurations contribute to reducing latency in function execution. This is particularly important for functions that rely on external services or data sources.
In terms of network design, note that configuring ENIs (Elastic Network Interfaces) takes a considerable amount of time and contributes to the delay in cold starts — therefore consider sticking to the default network environment unless you require a VPC resource with a private IP (note that AWS has been making substantial improvements in how ENIs connect to Lambda environments with customer VPCs, enhancing overall performance).
If a Lambda function operates within a VPC and communicates with an AWS resource, avoid public DNS resolution, as it can be quite time-consuming. For instance, if your Lambda function interacts with an Amazon RDS DB instance in your VPC, consider launching the instance with the non-publicly-accessible option to streamline the process.
✅ Apply regular updates
Staying updated with AWS Lambda releases and incorporating new features or optimizations is essential for continually enhancing performance. Regular updates ensure that your functions benefit from the latest improvements and stay aligned with best practices in the rapidly evolving serverless landscape.
✅ Conduct thorough testing
Conducting thorough performance testing with realistic workloads is crucial for identifying bottlenecks and optimizing accordingly. This ensures that your functions can handle the expected load efficiently.
✅ Leverage environment variables
Use environment variables for configurable parameters, facilitating easy scaling adjustments and operational improvements without modifying or redeploying code.
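For example, configuration can be read from environment variables with sensible defaults; the variable names below (TABLE_NAME, BATCH_SIZE) are purely illustrative:

```python
import os

def load_settings(env=os.environ):
    """Build runtime settings from environment variables, falling back
    to defaults; the env parameter is injectable for testing."""
    return {
        "table_name": env.get("TABLE_NAME", "orders-dev"),
        "batch_size": int(env.get("BATCH_SIZE", "25")),
    }
```

Changing these values in the Lambda console or IaC template takes effect on the next invocation, with no code change or redeployment.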
✅ Set up monitoring tools to measure performance
If you can’t measure performance, you can’t improve it. It’s as simple as that. Monitoring tools like CloudWatch help you promptly detect and address performance issues of Lambda functions and other AWS services natively. Implement monitoring for invocation counts, durations, memory usage, concurrency stats, and usage costs. You can also build graphs and dashboards on CloudWatch with these metrics. Additionally, set alarms to proactively respond to changes in utilization, performance, or error rates.
Lambda sends metric data to CloudWatch in 1-minute intervals. For more immediate insight into your Lambda function, you can create high-resolution custom metrics as well.
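One way to publish custom metrics without extra API calls from the function is the CloudWatch Embedded Metric Format (EMF): the function prints a structured JSON log line and CloudWatch extracts metrics from it. A sketch with a placeholder namespace and metric name:

```python
import json
import time

def emit_metric(name, value, unit="Count", namespace="MyApp"):
    """Print a log line in CloudWatch Embedded Metric Format (EMF).
    CloudWatch parses these lines into custom metrics automatically,
    so no put_metric_data call is needed from the function. The
    namespace, dimension, and metric name here are placeholders."""
    payload = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [["FunctionName"]],
                "Metrics": [{"Name": name, "Unit": unit}],
            }],
        },
        "FunctionName": "my-function",  # illustrative dimension value
        name: value,
    }
    line = json.dumps(payload)
    print(line)  # stdout goes to CloudWatch Logs in Lambda
    return line
```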
Read more about CloudWatch support for monitoring Lambda metrics:
> Monitoring functions on the Lambda console: https://docs.aws.amazon.com/lambda/latest/dg/monitoring-functions-access-metrics.html
> Working with Lambda function metrics: https://docs.aws.amazon.com/lambda/latest/dg/monitoring-metrics.html
> Using Lambda Insights in Amazon CloudWatch: https://docs.aws.amazon.com/lambda/latest/dg/monitoring-insights.html
Conclusion
By incorporating these performance best practices and techniques, you can elevate the speed and efficiency of your Lambda applications to a whole new level. Over the past years, the AWS team has been super innovative and has released a plethora of new features to optimize resource utilization, reduce latency, and ensure a seamless user experience on the Lambda ecosystem. With continuous learning and a proactive approach to implementing AWS Lambda performance best practices, you can maximize the value of your AWS investment and achieve your business objectives.
Stay tuned for the next AWS tip. Until then, happy coding!