AWS Lambda brings new use cases with increased ephemeral storage

Mark Faiers
Contino Engineering
6 min read · Mar 24, 2022

by Mark Faiers & Brett Lester

Introduction

In this blog post we’ll introduce you to a recent improvement to the AWS Lambda service: ephemeral storage is now configurable, and its maximum has increased from 512MB to 10GB.

We’ll cover common Lambda use cases, the change itself, what the additional storage makes possible, and the practical steps you can take to get started now.

Lambda

Released in 2014, AWS Lambda is a serverless Function-as-a-Service (FaaS) platform that shifts the management of the physical servers, the virtual servers, and the container layer beneath your application to AWS. This allows you to focus on the things that generate business value, rather than setting up and managing infrastructure.

Lambda supports most modern programming languages, integrates deeply with many other AWS services, and scales automatically with demand, subject to your account’s concurrency limits.

Common Lambda Use Cases

Since its inception, use of Lambda has grown enormously, and more and more use cases have been identified. The ones we see most often at Contino are:

  • Application Backends: one of the most common use cases for Lambda functions is processing API calls. Usually this means integrating with API Gateway so that when an endpoint is called, the Lambda function is invoked, processes the request, and returns a response to the user. This is most commonly used in the context of an application where each Lambda function maps one-to-one to an API endpoint.
  • File Processing: being an event-driven service, Lambda can be invoked when a file is uploaded, updated, or moved. This allows for a wide range of applications, for example scanning files or packages in an S3 bucket for viruses before they are used, to help enhance application and platform security.
  • Automating regular tasks: Lambda functions are invoked by ‘events’, which can be configured to occur at regular intervals using cron expressions. For example, a Lambda function could be configured to be invoked once per day, once per hour, or every Tuesday and Wednesday. This is useful for automating tasks that need to be carried out on a regular basis, such as creating backups, or shutting down/spinning up test environments.
  • Stream processing: being extremely scalable makes Lambda an ideal tool for processing stream data, and its tight integration with Kinesis makes such a solution easy to set up and manage.
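
The schedules mentioned under ‘Automating regular tasks’ could be written roughly as follows. This is a minimal sketch, assuming the six-field `cron(...)` syntax used by Amazon EventBridge schedule rules (minutes, hours, day-of-month, month, day-of-week, year). The dictionary keys, the 00:00 UTC times, and the helper `looks_like_schedule` are hypothetical choices made for illustration.

```python
# Example schedule expressions in the six-field EventBridge cron format:
# cron(minutes hours day-of-month month day-of-week year).
# One of the two day fields is given as "?" in each expression.
SCHEDULES = {
    "once_per_day": "cron(0 0 * * ? *)",        # 00:00 UTC every day
    "once_per_hour": "cron(0 * * * ? *)",       # minute 0 of every hour
    "tue_and_wed": "cron(0 0 ? * TUE,WED *)",   # 00:00 UTC on Tuesdays and Wednesdays
}


def looks_like_schedule(expression: str) -> bool:
    """Shallow shape check only; it does not validate individual field values."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        return False
    fields = expression[len("cron("):-1].split()
    # Six fields, with "?" in exactly one of the two day fields.
    return len(fields) == 6 and (fields[2] == "?") != (fields[4] == "?")
```

A check like this only catches obvious mistakes, such as pasting a five-field Unix cron line; the service itself remains the authority on whether an expression is valid.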

What is Changing?

AWS constantly reviews customer feedback for ways to improve its services; ‘customer obsession’ is one of its well-known company-wide principles. As a result, it is increasing the ephemeral storage of Lambda functions from the previous fixed 512MB to a maximum of 10GB, and making the amount of storage configurable by the user.

Whilst this may not make a difference for all use cases, such as those that provide request-response functionality, it will significantly improve the viability of Lambda for tasks like processing large files.

We discussed file processing earlier, in the context of scanning S3 objects for viruses/malware, and this is something that we at Contino have implemented for some of our clients. Increased ephemeral storage will make this considerably easier.

Previously, files larger than 512MB had to be broken apart, or ‘chunked’, so that each part could be scanned separately. This adds complexity and increases the chance that a virus or piece of malware goes undetected. With the increased ephemeral storage, chunking is no longer necessary in most cases, which makes the solution easier to understand and improves malware detection.

This is far from the only use case the extra ephemeral storage will enable, however, and we at Contino will be looking out for other opportunities to put it to use.
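
To make the ‘whole file versus chunked’ decision concrete, here is a minimal sketch of how a function might choose between the two at runtime. It is an illustration rather than the scanner we deployed: `plan_scan` and its `headroom` parameter are hypothetical names, and the only real API used is `shutil.disk_usage` from the Python standard library.

```python
import shutil


def plan_scan(object_size_bytes: int, tmp_dir: str = "/tmp", headroom: float = 0.1) -> str:
    """Return "whole" if the object fits in scratch space, else "chunked".

    `headroom` is a hypothetical safety margin: the fraction of free space
    kept in reserve for the scanner's own temporary files.
    """
    free_bytes = shutil.disk_usage(tmp_dir).free
    usable_bytes = free_bytes * (1.0 - headroom)
    return "whole" if object_size_bytes <= usable_bytes else "chunked"
```

With 10GB configured, most objects would take the "whole" path, and the chunked path remains only as a fallback for unusually large files.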

Important Considerations and Limitations

Timeouts: although ephemeral storage is increasing, the maximum length of time a Lambda function can run for remains 15 minutes. Handling large files is typically time-consuming, so you should test that your function completes consistently within this timeframe before deploying to production. You should also handle scenarios where it can’t, for example by alerting on failures.
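
One way to stay inside the time limit is to check the remaining time as you go and stop early. The sketch below assumes the work can be split into independent items; `process_in_batches`, `handle_item`, and the 30-second margin are hypothetical, while `get_remaining_time_in_millis()` is the method the Lambda context object provides in the Python runtime.

```python
def process_in_batches(items, context, handle_item, safety_margin_ms=30_000):
    """Process items until done or until the time budget is nearly spent.

    Returns the items that were NOT processed, so the caller can
    re-queue them or raise an alert.
    """
    remaining = list(items)
    while remaining:
        # Stop before the function is killed by the timeout.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            break
        handle_item(remaining.pop(0))
    return remaining
```

If the returned list is not empty, the handler could publish it to a queue or emit a metric that an alarm watches, rather than letting the invocation time out silently.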

Cost: if you are performing compute- or memory-intensive tasks then you’ll very likely need to increase the memory/compute provided to each function invocation in order to optimise the job it performs. This comes with additional cost, which is something to be aware of, particularly if you are running a lot of invocations of the function. Measuring and tracking the costs of running your workloads so that you can continuously optimise is always recommended, and Lambda is no different.
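
A rough estimate can help when comparing configurations. The sketch below rests on stated assumptions, not on a pricing reference: it assumes both memory and ephemeral storage are billed per GB-second of execution time, and that only storage above a 512MB baseline is charged. The function name and all parameters are hypothetical, and the prices are inputs you would need to look up for your region.

```python
def estimate_monthly_cost(invocations, avg_duration_s, memory_mb, storage_mb,
                          price_per_gb_second, storage_price_per_gb_second,
                          free_storage_mb=512):
    """Rough monthly cost for duration and extra ephemeral storage only.

    ASSUMPTION: both charges scale with execution time, and only storage
    above `free_storage_mb` is billed. Request charges and any free tier
    are ignored.
    """
    total_seconds = invocations * avg_duration_s
    compute_gb_seconds = total_seconds * memory_mb / 1024
    extra_storage_gb = max(storage_mb - free_storage_mb, 0) / 1024
    storage_gb_seconds = total_seconds * extra_storage_gb
    return (compute_gb_seconds * price_per_gb_second
            + storage_gb_seconds * storage_price_per_gb_second)
```

Running it for a few memory and storage combinations, with measured durations, gives a quick sense of which setting dominates the bill before you commit to one.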

Ephemeral Storage: the clue is in the name. The storage is temporary and cannot be accessed outside the context of a Lambda function invocation. So, if you need more persistent storage, you will still need to use something like S3 or EFS.

How to get started — Using the AWS Console

You can get started using this increased storage by either creating a new Lambda function, or editing an existing one.

In the console you can do this by navigating to the Lambda service, selecting a function, and then opening the ‘Configuration’ tab. From there you can choose to ‘Edit’ the general configuration of the function.

On the configuration page that appears you can then choose how much ephemeral storage to use for the function.

Simply enter a value, save the function, and the increased storage will be applied. You can then start using the increased storage available in the /tmp folder.
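
Once the setting is saved, using the space is ordinary file I/O. The following is a minimal sketch of a handler that writes to scratch space and cleans up afterwards, since files in /tmp may still be present if the same execution environment handles a later invocation. The helper `write_scratch_file`, the file name `payload.bin`, and the `body` event field are hypothetical.

```python
import os
import tempfile


def write_scratch_file(data: bytes, scratch_root: str = "/tmp") -> int:
    """Write data to a scratch file, read its size back, then clean up."""
    # TemporaryDirectory removes the directory and its contents on exit, so
    # nothing is left behind if the execution environment is reused.
    with tempfile.TemporaryDirectory(dir=scratch_root) as work_dir:
        path = os.path.join(work_dir, "payload.bin")
        with open(path, "wb") as handle:
            handle.write(data)
        return os.path.getsize(path)


def lambda_handler(event, context):
    # "body" is a hypothetical event field used only for this illustration.
    payload = event.get("body", "").encode("utf-8")
    return {"bytes_written": write_scratch_file(payload)}
```

Cleaning up explicitly keeps the free space predictable from one invocation to the next, which matters more as the files you handle approach the configured limit.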

How to get started — Using CloudFormation

If you use CloudFormation to provision and configure your AWS resources, you can update the ephemeral storage configuration of your functions by supplying the `EphemeralStorage` object in your CloudFormation code. The configuration specification can be found in the Lambda CloudFormation documentation. The object requires only one property, `Size`, which is specified in megabytes (MB) and must be a value between 512 and 10240. Your configuration will look something like the following:

MyFunction:
  Type: AWS::Lambda::Function
  Properties:
    # Other function properties are omitted from this fragment.
    EphemeralStorage:
      Size: 10240  # in MB; must be between 512 and 10240

You can then use CloudFormation to provision or update your Lambda function in the same manner as you currently do.
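
If you generate templates programmatically, it can be worth validating the size before deployment. This is a minimal sketch under that assumption: `ephemeral_storage_property` is a hypothetical helper, and the bounds come from the 512 to 10240 range described above.

```python
MIN_SIZE_MB = 512
MAX_SIZE_MB = 10240


def ephemeral_storage_property(size_mb: int) -> dict:
    """Build the EphemeralStorage property, rejecting out-of-range sizes."""
    if not isinstance(size_mb, int) or isinstance(size_mb, bool):
        raise TypeError("size_mb must be an integer number of megabytes")
    if not MIN_SIZE_MB <= size_mb <= MAX_SIZE_MB:
        raise ValueError(f"size_mb must be between {MIN_SIZE_MB} and {MAX_SIZE_MB}")
    return {"EphemeralStorage": {"Size": size_mb}}
```

The returned dictionary can be merged into the `Properties` of a function resource and serialised with `json.dumps` for a JSON template, so a mistyped size fails locally instead of during a stack update.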

Key Takeaways

Lambda has more use cases than ever: as we have shown in this post, the extra ephemeral storage widens the range of workloads that suit Lambda. Where you may previously have been better off using EC2 or ECS, you can now, in more cases, avoid the overhead of managing virtual servers and other resources.

There is no silver bullet: while the use cases for Lambda are seemingly ever growing, it is not suitable for everything. This is particularly true if long-running processes are required. Lambda has a 15-minute timeout, which must be taken into account if you aim to use it.

It is very easy to configure extra ephemeral storage: whether you’re using the Console or CloudFormation, adding extra ephemeral storage takes only a small piece of extra configuration, and it is usable right away.

And Finally…

If you would like to find out more about how AWS Lambda can help to empower your organisation and deliver on your transformation objectives, come and speak to us at Contino, or to the authors, Mark Faiers and Brett Lester.
