The evolution from servers to functions

Peer through the lens of a developer’s most common decision criteria — time and money

Mayank Lahiri
A Cloud Guru
12 min read · Sep 11, 2017


Serverless architecture is the newest cloud computing paradigm that promises to lower overall development and operational costs — with a combination of new technologies and pricing structures.

While the architecture is quickly evolving, serverless lacks a standard definition and general consensus within the technical community.

The term serverless is used liberally to label everything from commercial products and open-source projects to aspects of the underlying architecture. It doesn’t help that the term falsely implies the absence of servers.

Before trying to define serverless, it’s important to understand how we even arrived at this inflection point of modern architecture. To trace the evolution of servers to serverless, let’s peer through the lens of a developer’s most common decision criteria — time and money.

Rack ‘em and Stack ‘em

In the beginning, there were servers.

Remember servers? Those large, flat, physical boxes with expensive, industrial-grade components inside? They were generally placed inside data centers with redundant power supplies and high-speed networks.

The server was installed in a server rack, powered up, and connected to the network. After the initial install of the operating system and patches, the server needed to be configured with web servers, databases, and caches.

Once properly configured, the application code would finally be copied to each server. The end of the setup was just the beginning — servers would have to be monitored, patches continuously applied, and new servers added when application usage increased or servers failed.

The effort involved in administering a server far exceeds the time required to develop an application — and represents an ongoing cost for the duration of the application’s lifespan. The only rational driver for installing physical servers is retaining absolute control over the hardware.

Today, this model of running servers has been streamlined into commercial forms such as server colocation, bare metal hosting, and dedicated servers. Each of these models varies in the degree of manual labor required, the method of hardware procurement, and the pricing model — but they are all still best suited for those who need absolute control over their own hardware.

Generally speaking, this approach suits large organizations with dedicated IT staff and specialized needs.

Examples: Rackspace Managed Colocation, Joe’s Data Center Colocation, Rackspace OnMetal, Scaleway

It’s virtually a server

With the advent of virtualization technology, and particularly hardware support for it baked into CPUs around 2005, a physical server could be efficiently split into multiple, smaller virtual servers — or virtual machines.

Each virtual server could run its own separate copy of a standard operating system like Linux, Windows, or FreeBSD. Each instance was completely isolated from the other virtual servers running on the same physical machine.

An expensive $10,000 server that was over-provisioned could now be split into five smaller $2,000 servers — each running a different operating system, with no modifications required to any existing software.

Virtualization works by having an additional hardware-assisted layer of software called a hypervisor that sits below a traditional operating system. The purpose of the hypervisor is to effectively emulate a physical server, with or without the cooperation of the operating system.

Virtualization had an interesting side effect: since the hypervisor acts as an intermediary between a virtual machine and the actual hardware, a running virtual machine could be “frozen” at any time into a system snapshot — essentially creating a large snapshot file.

The snapshot could be copied to a different server, where an exact clone of the virtual machine could be restored. A snapshot could comprehensively capture everything from the running state of programs to the contents of allocated memory, or simply capture an image of the virtual “hard disk”.

A fleet of virtual servers

Commercial hosting providers quickly began to offer virtual private servers, or VPSs, as an alternative to running a dedicated server. The vendors would install and configure a popular operating system like Linux or Windows onto a virtual machine, and then take a snapshot to obtain a master copy.

These snapshots could then be copied to any number of physical servers, and restored to a fully-functioning virtual machine in a matter of minutes or seconds. Since this process could be repeated any number of times, large fleets of virtual servers running identical software could be created and destroyed quickly — rather than setting up physical servers individually.
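
To make that workflow concrete, here is a minimal sketch of what snapshot-based fleet creation looks like with a modern cloud API — in this case Python and boto3 against Amazon EC2. The machine image ID is a hypothetical placeholder standing in for the provider’s master snapshot, and configured AWS credentials are assumed.

```python
# Sketch: launch ten identical virtual servers from one master image.
# ImageId and InstanceType below are placeholder values, not real ones.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical master snapshot/image
    InstanceType="t3.micro",
    MinCount=10,
    MaxCount=10,
)

# Each restored copy is a fully functioning virtual server within minutes.
print([instance["InstanceId"] for instance in response["Instances"]])
```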

By leveraging virtual machines and economies of scale, the providers were also able to make better use of their fleet of physical servers.

With virtual machines, a single server could be efficiently split into a mix of variously sized servers — each possibly leased out to a different developer. The mix of virtual servers could be altered at any time, allowing a wider range of server sizes to be offered to developers while enabling more efficient utilization of the provider’s inventory of physical servers.

The use of VPSs provided developers with a few new advantages:

  • Scale. The server operating system only had to be configured once, and the frozen state could then be re-used to quickly create many identical virtual servers.
  • Reliability. If the physical server failed for any reason, the hosting provider could automatically re-create the VPS on a different physical server and switch the old IP address over to the new VPS.
  • Flexibility. Smaller and cheaper instances became widely available, along with the flexibility to mix them with larger instances to balance cost against performance needs.
  • Cost Control. Since a VPS could be easily created and destroyed, providers generally billed by the hour or month instead of by the year. Large fleets could be created for a few hours, used, and then destroyed without incurring the cost of an annual server lease.

Virtualization had a massive impact on the economics of running fleets of servers. The technology led to the popularity of cheap, managed installations of software frameworks like WordPress, as well as the now-ubiquitous $5 per month “pay as you go” virtual server.

Examples: Google Compute Engine, Amazon EC2, Linode, DigitalOcean

Contain yourself

In 2007, Google contributed a new feature to the Linux kernel — control groups, or cgroups — that offered a way to effectively replicate a limited form of virtualization within the Linux operating system itself.

The primary purpose of the feature was to safely bundle a Linux program and all its dependencies into a container image. The image could then be cloned and run on other Linux machines — with a degree of isolation from other containers running on the same machine.

Although this sounds very similar to virtualization, there are a number of important technical differences between virtual machines and containers.

One key difference between containers and virtual machines is that virtual machines run an operating system on top of a hypervisor, but containers run programs on top of an operating system.

Containers vs Virtual Machines, courtesy of Docker Inc. and RightScale Inc.

One of the drawbacks of starting a virtual machine is that the process involves an entire operating system boot, or requires the restoration of running system state from a frozen snapshot.

Both of these procedures can take on the order of minutes to complete. And after the startup sequence completes, running a virtual machine consumes an entire operating system’s worth of CPU and memory overhead.

Containers, on the other hand, run within a single host operating system. The program being executed only consumes the CPU and memory it uses within the container. When a container is started, the application program inside it starts like a regular program — but the operating system restricts it to run in isolation, on only a slice of overall system resources.

This feature leads to two interesting properties of containers relative to virtual machines:

  • Lower resource usage. The memory and CPU overhead of a container is far less than that of a virtual machine. Containers do not have an entire operating system worth of background processes, device drivers, and other paraphernalia running alongside the developer’s programs.
  • Faster startup. Virtual machines must be either booted through a standard operating system boot procedure, or restored from a suspended state. A container, on the other hand, starts with latency that is comparable to double-clicking a program on your desktop, while offering many of the benefits of virtual machines.
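
To illustrate that startup difference, here is a small sketch using the Docker SDK for Python (the `docker` package); the image tag is arbitrary, and a locally running Docker Engine is assumed.

```python
# Sketch: start a container roughly the way you'd start a local program.
# Requires Docker Engine and the Docker SDK for Python (pip install docker).
import time

import docker

client = docker.from_env()

start = time.time()
# Runs a short-lived container, waits for it to exit, and returns its logs.
output = client.containers.run("alpine:3.18", ["echo", "hello from a container"])
print(output.decode().strip())
print(f"container started, ran, and exited in {time.time() - start:.2f}s")
```

No operating system boots here: the host kernel simply launches one isolated process, which is why the latency is closer to starting a desktop program than booting a virtual machine.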

Since containers consume less memory and fewer CPU cycles than entire virtual machines, providers could utilize their fleet of physical servers even more efficiently. So instead of packing ten virtual machines to a single server, they could now pack fifty containers onto the same machine.

Providers could now offer developers a new tradeoff — running applications in a container instead of a virtual machine. Containers were a great option as long as the developer didn’t need to interact with system-level components, could express their application and all its dependencies as a file system image, and could target one of the handful of operating systems that supported containers.

With containers, developers still had many of the benefits of virtual machines, such as installing custom system libraries and modifying many operating system settings — but with a faster deployment process than virtual machines.

Container images were generally smaller, faster to start, easier to test, and could be quickly created by scripts and tested locally. Due to the use of layered filesystems by container runtimes, images could be built, updated, and uploaded much faster than virtual machine images.

However, one thing that didn’t change was the billing method — providers ran containers continuously and charged for the number of hours that each container executed. The billing would occur even if the container consisted of an idle program that did no work, matching the experience of renting a VPS by the hour.

Examples: Docker Engine, LXD, Mesos, Kubernetes, AWS Elastic Beanstalk

Platform as a Service (PaaS)

At some point, providers realized that many developers were using similarly configured containers to run their applications. A typical web-facing setup might contain a web server and a web application runtime and framework, like Ruby on Rails or PHP, along with the developer’s application code.

These containers would typically serve an HTTP-based API, allowing easy access from web frontends, native applications, and desktop applications. Of all the components and auxiliary software that went into the “stack”, the only one that most developers truly cared about was their own application code — their business logic.

If the remaining components could be managed by the provider — such as the web server, language runtime, and system libraries — the developer would be free to focus on writing just their application code. The provider would be responsible for setting up a largely standard OS environment for the developer. In return, the developer would get a consistent environment that reduced development time as well as maintenance costs.

To achieve the goal of focusing only on the business logic that differentiates a product, everything else should be abstracted into a commoditized service. The developer must learn to relinquish control to the provider, counting on the provider to create a standard environment in which to run the developer’s code.

The provider takes on all the tasks associated with maintaining and running a scalable web service — from provisioning virtual or physical servers, to configuring the operating system and server software used, to determining the exact version of a language’s runtime. Since the developer no longer has to concern themselves with these tasks, the size of the code required to bring their application live is drastically reduced.
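
As an illustration of how little code can be left, here is a sketch of a complete PaaS-style application in Python using Flask — everything beneath the route handler (web server, runtime, OS, scaling) would be the platform’s responsibility. The route and greeting are made up for illustration.

```python
# Sketch: the entirety of a developer's code on a PaaS might be this small.
# The platform supplies the OS, web server, language runtime, and scaling;
# the developer supplies only the business logic below.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/greet/<name>")
def greet(name):
    # Business logic only; no provisioning or server configuration here.
    return jsonify(message=f"Hello, {name}!")

if __name__ == "__main__":
    # Run locally; a PaaS like Heroku would instead launch this via its own
    # process manager (e.g., "web: gunicorn app:app" in a Procfile).
    app.run(port=8000)
```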

While this approach led to the evolution of the Platform-as-a-Service, where continuously running containers could be dynamically created and destroyed by developers, there were still further cost savings to be gained.

Examples: Heroku, Meteor Galaxy

Functions as a Service (FaaS)

Due to the small size and fast startup times of containers, providers soon realized they could run the developer’s application only when a request was received — and then immediately shut it down to save resources.

Instead of keeping a copy of the developer’s code running continuously at all times and incurring billing charges, providers could now wait for a user request — and only then create a container with the developer’s code to service it. After the developer’s code responds to the request, the container would be destroyed, freeing up system resources for other requests.

This leads to one of the defining characteristics of the serverless paradigm: short-lived, container environments that are created to service individual requests and other events. These events can be HTTP requests, WebSocket connections, work queue items, and database notifications, among many other possible triggers.
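
A minimal sketch of such a function, written in the style of an AWS Lambda Python handler — the event fields shown are illustrative of an HTTP trigger routed through an API gateway, not a guaranteed schema:

```python
# Sketch: a complete serverless "function". It exists only for the
# lifetime of one event and keeps no state between invocations.
import json

def handler(event, context):
    # "event" carries the trigger payload; the "body" field below is
    # illustrative of an HTTP request forwarded by an API gateway.
    name = json.loads(event.get("body") or "{}").get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```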

From the provider’s point of view, requests for multiple developers’ applications can be efficiently distributed across a large fleet of servers. Each container is short-lived — and only consumes resources for the duration of servicing a request. The provider’s fleet can be utilized even more efficiently than with continuously-running containers, which consume memory and CPU resources even when idle.

Since individual requests can be routed to any physical machine in the fleet, providers were able to offer developers the holy grail of scaling — instant, massive parallelism — without any service degradation for large, sudden spikes in traffic (e.g., after a Super Bowl advertisement).

For developers, the serverless experience offers many other perks as well, in addition to the instant parallelism.

  • Simpler code: instead of writing long-lived web serving programs, the serverless paradigm encourages small, stateless programs — or functions — that can be created and destroyed on demand. Instead of keeping track of every client connected to the server, the developer’s code can now assume that it is created to communicate with exactly one client.
  • No provisioning: instead of having to determine the performance capacity needs for an application before it’s launched into production, developers can now deploy a serverless application without worrying about the number of servers or instances to reserve.
  • No paying for idle time: with a serverless application, developer code is only executed in response to trigger events, on an “invisible” fleet of actual servers hidden from the developer’s view. Since the developer has no servers or instances to reserve, the only sensible billable unit is the total amount of time the provider actually spends servicing incoming events.

A major tenet for writing serverless code is to express business logic as programs that are created to serve a single trigger event — and only that event. After the program finishes handling the trigger event, it is destroyed.

This approach to serverless architecture is in stark contrast to traditional client/server programming, where a single program with multiple threads may accept and respond to many simultaneous HTTP connections using conveniences like shared memory between requests.

As a result, there is a new onus on developers to write or rewrite their applications using a different paradigm. In many cases, however, this rewrite can actually reduce the amount of code required for a particular application.

Examples: AWS Lambda, Google Cloud Functions, Azure Functions.

Serverless services

While serving websites is currently the largest use case, the serverless paradigm extends well beyond this common pattern. Providers are already offering serverless data warehouses, serverless databases, and serverless stream processing systems.

In the future, we can expect this paradigm to quickly extend to other classes of products — since the benefits for developers are so glaringly obvious. To help determine if a product or service is “serverless” — you can often just look at the pricing page for the following properties:

  • No provisioning: If it requires explicit capacity provisioning — it is likely not a serverless platform. The product or service should not require the developer to reserve an explicit number of instances, servers, or containers. A serverless platform increases capacity as and when needed, keeping the details hidden from the user.
  • Bill by usage: If a product or service bills by the time an application is kept available for use, rather than for the actual usage of the application — then it is likely not a serverless platform. A quick litmus test is to check that an idle application will result in a negligible bill at the end of the month (the back-of-the-envelope sketch after this list makes that concrete).
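
As a back-of-the-envelope illustration of usage-based billing — using made-up rates, not any provider’s actual prices — consider:

```python
# Sketch: pay-per-use billing means an idle application costs almost nothing.
# The rates below are illustrative placeholders, not real provider prices.
PER_MILLION_REQUESTS = 0.20  # dollars per 1M invocations (hypothetical)
PER_GB_SECOND = 0.0000167    # dollars per GB-second of compute (hypothetical)

def monthly_bill(requests, avg_seconds, memory_gb):
    compute = requests * avg_seconds * memory_gb * PER_GB_SECOND
    return requests / 1_000_000 * PER_MILLION_REQUESTS + compute

print(monthly_bill(1_000_000, 0.1, 0.128))  # a busy month: about $0.41
print(monthly_bill(0, 0.1, 0.128))          # an idle month: $0.00
```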

TL;DR — check out the summary table

The following table breaks down the product categories based on how much of the stack is managed by the developer. Each row represents a slice of the stack — from the first line of application code at the top, to the physical data center that hosts the server ultimately running the application.

Thanks for reading!

Many thanks to Oliver Bowen, Drew Firment, Merhawi Redda Tamrat, Kapil Thadani, and Ishaan Joshi for their contributions to this post.
