“Where do you store files without a server?”
…is the most common question I get asked during Q&A after one of my “Introduction to Serverless Platforms” conference talks. Search for this question online and this is the answer you will often find.
“Use an object store for file storage and access using the S3-compatible interface. Provide direct access to files by making buckets public and return pre-signed URLs for uploading content. Easy, right?”
Responding to people with this information rarely helps.
Developers who are not familiar with cloud platforms can often understand the benefits and concepts behind serverless, but don’t know the other cloud services needed to replicate application services from traditional (or server-full) architectures.
In this blog post, I want to explain why we do not use the file system for files in serverless applications and introduce the cloud services used to handle this.
Serverless Runtime File Systems
Serverless runtimes do provide access to a filesystem with a (small) amount of ephemeral storage.
Serverless application deployment packages are extracted into this filesystem prior to execution. Uploading files into the environment relies on them being included within the application package. Serverless functions can read, modify and create files within this local file system.
These temporary file systems come with the following restrictions…
- Maximum application package size limits additional files that can be uploaded.
- Serverless platforms usually limit total usable space to around 512MB.
- Modifications to the file system are lost once the environment is not used for further invocations.
- Concurrent executions of the same function use independent runtime environments and do not share filesystem storage.
- There is no access to these temporary file systems outside the runtime environment.
All these limitations make the file system provided by serverless platforms unsuitable as a scalable storage solution for serverless applications.
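To make these restrictions concrete, here is a minimal sketch of a function that keeps a counter in the local file system. The `main(params)` handler shape and the `invocation_counter.txt` file name are assumptions for illustration, not something a particular platform requires.

```python
import os
import tempfile

# Hypothetical scratch file inside the runtime's ephemeral storage.
COUNTER_PATH = os.path.join(tempfile.gettempdir(), "invocation_counter.txt")


def main(params):
    """Count invocations using a file in the local (ephemeral) file system."""
    count = 0
    if os.path.exists(COUNTER_PATH):
        with open(COUNTER_PATH) as f:
            count = int(f.read())
    count += 1
    with open(COUNTER_PATH, "w") as f:
        f.write(str(count))
    # The value only reflects invocations handled by THIS runtime environment.
    return {"invocations_seen_by_this_environment": count}
```

Run locally, the counter increases on every call. On a serverless platform it would restart whenever a fresh environment is created, and concurrent environments would each keep their own, different count.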
So, what is the alternative?
Object stores manage data as objects, as opposed to other storage architectures like file systems which manage data as a file hierarchy. Object-storage systems allow retention of massive amounts of unstructured data, with simple retrieval and search capabilities.
Object stores provide “storage-as-a-service” solutions for cloud applications.
These services are used for file storage within serverless applications.
Unlike traditional block storage devices, data objects in object storage services are organised using flat hierarchies of containers, known as “buckets”. Objects within buckets are identified by unique identifiers, known as “keys”. Metadata can also be stored alongside data objects for additional context.
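The bucket, key and metadata model can be sketched with a small in-memory class. This is a toy illustration only: the `Bucket` and `StoredObject` names and the example keys are invented here and do not correspond to any real client library.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class Bucket:
    """A flat mapping from key to object; there are no real directories."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, key, data, **metadata):
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        # "Folders" are only a naming convention: filter keys by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = Bucket("holiday-photos")
bucket.put("2018/paris/eiffel.jpg", b"...", content_type="image/jpeg")
bucket.put("2018/rome/forum.jpg", b"...", content_type="image/jpeg")
bucket.put("readme.txt", b"hello")
```

Calling `bucket.list_keys("2018/")` returns only the two photo keys. The slashes in a key look like a path, but they are just characters in a single flat namespace.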
Object stores provide simple access to files by applications, rather than users.
Advantages Of An Object Store
scalable and elastic storage
Rather than having a disk drive, with a fixed amount of storage, object stores provide scalable and elastic storage for data objects. Users are charged based upon the amount of data stored, API requests and bandwidth used. Object stores are built to scale as storage needs grow towards the petabyte range.
simple http access
Object stores provide an HTTP-based API endpoint to interact with the data objects.
Rather than using standard library methods to access the file system, which translate into system calls to the operating system, applications access files over a standard HTTP endpoint.
Client libraries provide a simple interface for interacting with the remote endpoints.
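As a rough sketch of what a client library does under the hood, the snippet below builds the HTTP requests for uploading and downloading an object. The endpoint hostname, bucket and key are placeholders, the path-style `/<bucket>/<key>` URL layout is an assumption, and the authentication headers a real service requires are left out, so the requests are constructed but never sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Placeholder endpoint; a real service publishes its own endpoint hostnames.
ENDPOINT = "https://object-store.example.com"


def object_url(bucket, key):
    # Path-style addressing: /<bucket>/<key>
    return f"{ENDPOINT}/{quote(bucket)}/{quote(key)}"


def upload_request(bucket, key, data, content_type="application/octet-stream"):
    # Storing an object is an HTTP PUT with the file contents as the body.
    return Request(object_url(bucket, key), data=data, method="PUT",
                   headers={"Content-Type": content_type})


def download_request(bucket, key):
    # Reading an object is a plain HTTP GET on the same URL.
    return Request(object_url(bucket, key), method="GET")


req = upload_request("holiday-photos", "2018/paris/eiffel.jpg", b"...", "image/jpeg")
print(req.get_method(), req.full_url)
```

Reading and writing files becomes a matter of issuing GET and PUT requests against a URL derived from the bucket name and object key, which is why any runtime with an HTTP client can use an object store.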
expose direct access to files
Files stored in object storage can be made publicly accessible. Client applications can access files directly without needing to use an application backend as a proxy.
Special URLs can also be generated to provide temporary access to files for external clients. Clients can even use these URLs to directly upload and modify files. URLs are set to expire after a fixed amount of time.
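The idea behind these temporary URLs can be sketched with an HMAC signature over the method, path and expiry time. This is a deliberately simplified scheme for illustration; it is not the signing algorithm used by S3-compatible services, and `SECRET_KEY`, `sign_url` and `verify_url` are hypothetical names.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical; real services use account credentials


def _signature(method, path, expires):
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(method, base_url, path, ttl_seconds, now=None):
    """Return a URL that grants `method` access to `path` for `ttl_seconds`."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires,
                       "signature": _signature(method, path, expires)})
    return f"{base_url}{path}?{query}"


def verify_url(method, url, now=None):
    """Server-side check: the signature must match and must not have expired."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    expected = _signature(method, parsed.path, expires)
    return hmac.compare_digest(signature, expected)
```

Because the HTTP method is part of the signed message, a URL issued for downloads cannot be reused for uploads, and changing the path or the expiry time invalidates the signature. The backend only hands out the URL; the file transfer itself goes directly between the client and the object store.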
IBM Cloud Object Storage
Buckets’ contents can be stored with the following automatic data resiliency choices.
- Cross Region. Store data across three regions within a geographic area.
- Regional. Store data in multiple data centres within a single geographic region.
- Single Data Centre. Store data across multiple devices in a single data centre.
Cross Region is the best choice for “regional concurrent access and highest availability”. Regional is used for “high availability and performance”. Single Data Centre is appropriate “when data locality matters most”.
Data access patterns can be used to save costs by choosing the appropriate storage class for data storage.
IBM Cloud Object Storage offers the following storage classes: Standard, Vault, Cold Vault, Flex.
Standard class is used for workloads with frequent data access. Vault and Cold Vault are used with infrequent data retrieval and data archiving workloads. Flex is a mixed storage class for workloads where access patterns are more difficult to predict.
Storage class and data resiliency options are used to calculate the cost of service usage.
Storage is charged based upon the amount of data storage used, operational requests (GET, POST, PUT…) and outgoing public bandwidth.
Storage classes affect the price of data retrieval operations and storage costs. Storage classes used for archiving, e.g. Cold Vault, charge less for data storage and more for operational requests. Storage classes used for frequent access, e.g. Standard, charge more for data storage and less for operational requests.
Higher resiliency data storage is more expensive than lower resiliency storage.
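The trade-off between storage classes can be sketched with a small cost calculation. All prices below are made-up numbers chosen only to show the shape of the trade-off; they are not IBM Cloud Object Storage prices, and the resiliency option, which would scale the rates further, is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClass:
    name: str
    storage_per_gb_month: float    # price per GB stored per month
    per_thousand_requests: float   # price per 1,000 operational requests
    egress_per_gb: float           # price per GB of outgoing public bandwidth


# Hypothetical prices, NOT real pricing for any provider.
FREQUENT = StorageClass("frequent-access", 0.03, 0.005, 0.09)
ARCHIVE = StorageClass("archive", 0.01, 0.05, 0.09)


def monthly_cost(storage_class, stored_gb, requests, egress_gb):
    """Sum the three billed dimensions: storage, requests and bandwidth."""
    return (stored_gb * storage_class.storage_per_gb_month
            + requests / 1000 * storage_class.per_thousand_requests
            + egress_gb * storage_class.egress_per_gb)


# Rarely read backups: lots of data, few requests.
backup = dict(stored_gb=1000, requests=10_000, egress_gb=1)
# Busy web assets: little data, many requests.
assets = dict(stored_gb=100, requests=5_000_000, egress_gb=50)

for workload in (backup, assets):
    for storage_class in (FREQUENT, ARCHIVE):
        print(storage_class.name, round(monthly_cost(storage_class, **workload), 2))
```

With these illustrative numbers the archive class is cheaper for the backup workload, while the frequent-access class is much cheaper for the busy assets workload, which is why matching the storage class to the access pattern matters.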
IBM Cloud Object Storage provides a generous free tier (25GB storage per month, 5GB public bandwidth) for Lite account users. IBM Cloud Lite accounts provide perpetual access to a free set of IBM Cloud resources. Lite accounts do not expire after a time period or need a credit card to sign up.
Serving files from serverless runtimes is often accomplished using object storage services.
Object stores provide a scalable and cost-effective service for managing files without using storage infrastructure directly. Storing files in an object store provides simple access from serverless runtimes and even allows the files to be made directly accessible to end users.
In the next blog posts, I’m going to show you how to set up IBM Cloud Object Storage and access files from serverless applications on IBM Cloud Functions. I’ll be demonstrating this approach for both the Node.js and Swift runtimes.
Originally published at jthomas.github.com on April 27, 2018.