Warehouse Computing and the Evolution of the Datacenter: A Layman’s Guide

Lenny Pruss
Lenny for your Thoughts
6 min read · Jun 2, 2015

You may not have noticed, but we’re in the midst of another massive platform shift in enterprise computing. We can debate chicken or egg, but I believe this most recent transformation is being driven primarily by the requirements placed on modern applications; requirements that are the result of the on-demand, always-on computing paradigm ushered in by cloud and mobile. Simply put, applications need to be scalable, available and performant enough to reach millions, if not billions, of connected devices and end-users. Infrastructure must mirror these specifications in kind.

Historically, systems design has ebbed and flowed between periods of aggregation (centralized) and disaggregation (distributed) of compute resources. The most recent evolution, from client/server to virtualized, cloud infrastructure, was driven largely by a desire to contain costs and consolidate IT around standards (the x86 instruction set, Windows and Linux), form factors (first blade servers, then VMs) and physical locations (the emergence of sprawling datacenters and giant cloud vendors). Now we’re seeing the pendulum swing back. Why?

A strong first principle is the notion that infrastructure is beholden to the application. Today, many applications are being built as large-scale distributed systems, composed of dozens (or even thousands) of services running across many physical and virtual machines, and often across multiple datacenters. In this paradigm, virtualization — which really dealt with the problem of low physical server utilization — doesn’t make much sense. In a highly distributed, service-oriented world, VMs come with too much overhead (read more on this here). Instead of slicing and dicing compute, network and storage, the better solution becomes to aggregate all machines and present them to the application as a pool of programmable resources, with hardware-agnostic software that manages isolation, resource allocation, scheduling, orchestration, etc. In this world, the datacenter becomes one giant warehouse computer controlled by a software brain.

However, the fact of the matter is that building, deploying and maintaining distributed applications is a highly technical feat. It requires a rethinking of the way applications treat and interact with other applications, databases, storage and network. Moreover, it requires a new toolkit that is central to solving the coordination and orchestration challenges of running systems that span across multiple machines, datacenters and time zones. To help understand what’s taking place, let’s deconstruct this new stack and, along the way, define some other key terms. Note that this is in no way a static, absolute taxonomy, but rather a simplified way to understand the layers that make up today’s application stack.

Physical Infrastructure — Actual servers, switches, routers and storage arrays that occupy the datacenter. This area was dominated by legacy OEMs (EMC, Cisco, HP, IBM, Dell) who are now giving way to low-cost ‘whitebox’ ODMs.

Vendors/Products: Quanta Computer, SuperMicro, Wistron

Virtualized Infrastructure — Emulated physical compute, network and storage resources that are the basis for cloud-based architectures. The enabling technology here is the hypervisor, which sits on top of bare-metal infrastructure and creates virtual clones of the server (or switch or storage array), each complete with a full OS with its own memory management, device drivers, daemons, etc.

Vendors/Products: Amazon Web Services, Google Cloud Platform, Microsoft Azure, VMware, OpenStack

Operating System — Often a stripped-down version of a host or guest OS that sits atop a virtual or physical host. The rise of Linux has been a key catalyst for the commoditization of the OS and physical infrastructure, further decoupling applications from hardware. Microsoft, with Windows Server, is still a dominant player in the traditional enterprise.

Vendors/Products: CoreOS, Project Atomic (Red Hat), Snappy (Ubuntu), RancherOS, Windows Nano Server

Container Engine — This is where it starts to get interesting, so let’s spend a little more time here. Linux containers offer a form of operating system-level virtualization, where the kernel of an OS allows for multiple user space instances. More simply, if hypervisor-based virtualization abstracted physical resources to create multiple server clones, each with its own OS, the type of virtualization enabled by containers is a higher-level abstraction of the OS, allowing appropriate levels of resource utilization and isolation to run multiple applications on a single kernel. The beauty of containers lies in the idea of “code once, run anywhere.” A container holds the application logic and all of its dependencies, running as an isolated process. What’s special about this is that it ultimately doesn’t matter what’s inside the container (files, frameworks, dependencies); it will still execute the same way in any environment — from laptop, to testing, to production across any cloud, at least theoretically. This enables application portability, which, in turn, commoditizes cloud infrastructure altogether.

Docker has become synonymous with containerization by making Linux containers (originally built on LXC) user-friendly. The important thing to note is that container technology is made up of two fundamental components: the runtime, or container engine, and the container image format. The runtime is effectively a high-level API that runs processes and manages isolation. The image format is a specification for a standard, composable unit for containers. In recent months we’ve seen several container runtimes and image specs come to market, which has caused a stir. I’m sure we’ll continue to see more.

Vendors/Products: Docker Engine, CoreOS rkt, Open Container Initiative
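
To make the runtime half of that split concrete, here’s a minimal sketch of what “run an isolated process from an image” looks like from the outside, driving the Docker CLI from Python. It assumes Docker is installed and running locally; the image and command are purely illustrative.

```python
# A minimal sketch of "code once, run anywhere": the same image runs the same
# way on a laptop or a cloud VM. Assumes the Docker CLI and daemon are present;
# the image name and command are illustrative.
import subprocess

def run_in_container(image, command):
    """Launch `command` as an isolated process inside `image` and return its output."""
    result = subprocess.run(
        ["docker", "run", "--rm", image] + command,
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    # The container ships its own userland, so the host OS flavor doesn't matter.
    print(run_in_container("alpine:latest", ["cat", "/etc/os-release"]))
```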

Service Discovery — Service discovery tools manage how processes and services in a cluster find and talk to one another. This becomes increasingly important as applications are run as collections of highly distributed, disparate services. DNS can be used as a solution here but suffers at scale, so many have now defaulted to building on top of highly consistent key-value stores.

Vendors/Products: etcd (CoreOS), Consul (HashiCorp), Zookeeper (Apache)
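
To see why a consistent key-value store is the natural backbone here, consider this toy sketch: services register the addresses they listen on under a well-known key, and clients look that key up instead of hard-coding hosts. The dict below is a stand-in for etcd, Consul or ZooKeeper, which add the hard parts (consistency, health checks, TTLs); all names and addresses are illustrative.

```python
# Toy service discovery over a key-value store. The store here is a plain dict;
# real systems use etcd, Consul or ZooKeeper for consistency and health checking.
import random

kv_store = {}

def register(service, address):
    """Announce that an instance of `service` is reachable at `address`."""
    kv_store.setdefault(f"/services/{service}", []).append(address)

def discover(service):
    """Pick one registered instance of `service` to talk to."""
    instances = kv_store.get(f"/services/{service}", [])
    if not instances:
        raise LookupError(f"no instances registered for {service}")
    return random.choice(instances)

register("billing", "10.0.0.12:8080")
register("billing", "10.0.0.13:8080")
print(discover("billing"))  # e.g. 10.0.0.13:8080
```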

Scheduling & Orchestration — Schedulers interface with the resources of the cluster and are responsible for providing a consistent way to intelligently place tasks based on those resources. Additional tools in this area are responsible for defining and declaring the infrastructure on which those tasks will run.

Vendors/Products: Kubernetes, Mesos (Mesosphere), Serf + Terraform (HashiCorp), Machine + Compose + Swarm (Docker)
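
A scheduler’s core job fits in a few lines, even if production versions are vastly more sophisticated. The sketch below places a task on whichever node still fits the request and has the most free CPU; Kubernetes and Mesos layer constraints, priorities, preemption and rescheduling on top of the same basic idea. Node capacities and the placement policy are illustrative.

```python
# A toy scheduler: pick a node with enough free capacity for the task.
nodes = {
    "node-a": {"cpu": 4.0, "mem_gb": 8.0},   # free capacity per node
    "node-b": {"cpu": 2.0, "mem_gb": 16.0},
}

def schedule(task_cpu, task_mem_gb):
    """Place a task on the feasible node with the most free CPU (a 'spread' policy)."""
    feasible = {
        name: free for name, free in nodes.items()
        if free["cpu"] >= task_cpu and free["mem_gb"] >= task_mem_gb
    }
    if not feasible:
        return None  # nothing fits; a real scheduler would queue or preempt
    chosen = max(feasible, key=lambda n: feasible[n]["cpu"])
    nodes[chosen]["cpu"] -= task_cpu
    nodes[chosen]["mem_gb"] -= task_mem_gb
    return chosen

print(schedule(1.0, 2.0))  # node-a (most free CPU)
print(schedule(3.0, 4.0))  # node-a again while it still fits, otherwise None
```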

Workflow, Monitoring & Management — Tools that automate the deployment lifecycle and management of applications and infrastructure. What some refer to as the management plane, these tools enable devs and sysadmins to deploy and maintain applications across compute clusters. This area is largely greenfield as tools are being adapted to work with containers, not just VMs or physical nodes, across distributed environments. Products in this area include everything from automation and config management frameworks and deployment pipelines to out-of-the-box PaaSes.

Vendors/Products: Chef, Puppet, Ansible, CloudFoundry, Flynn, CircleCI, TravisCI, Glider Labs, SignalFx, SysDig and dozens more.
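
Much of this layer boils down to one pattern: declare the state you want and let software converge reality toward it. Here’s a toy sketch of that reconcile loop, the model behind config management and container orchestration alike; the services and replica counts are made up.

```python
# Desired-state reconciliation: compare what should be running with what is,
# and emit the actions needed to close the gap. State below is illustrative.
desired = {"web": 3, "worker": 2}   # replicas we want per service
actual  = {"web": 1, "worker": 4}   # replicas currently running

def reconcile(desired, actual):
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for service, want in desired.items():
        have = actual.get(service, 0)
        if have < want:
            actions.append(f"start {want - have} x {service}")
        elif have > want:
            actions.append(f"stop {have - want} x {service}")
    return actions

print(reconcile(desired, actual))  # ['start 2 x web', 'stop 2 x worker']
```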

A few other helpful definitions:

Distributed System — A computing system consisting of a collection of autonomous nodes connected through a network and software/middleware which enables nodes to coordinate tasks and share resources of the entire system. The principle of distributed computing has been around for decades but only recently has it entered into mainstream IT as traditional software architecture has been pushed to its limits at Web scale. Perhaps the best known example is Apache Hadoop, an open-source data storage and processing framework where jobs are split and run across multiple commodity servers.
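
The Hadoop example is easy to miniaturize. In the sketch below, each “worker” maps over its own shard of the input and the partial results are reduced into one answer; in a real cluster the map calls run on different machines over far more data. Everything here is illustrative and runs in a single process.

```python
# A toy map/reduce word count: the split-the-job-across-machines idea behind Hadoop.
from collections import Counter

documents = ["the cloud ate my server", "the server is now a cloud"]

def map_word_counts(shard):
    """Map step: count words in one shard of the input."""
    return Counter(shard.split())

def reduce_word_counts(partials):
    """Reduce step: merge the per-shard counts into a single tally."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

partials = [map_word_counts(doc) for doc in documents]  # one per worker in a real cluster
print(reduce_word_counts(partials).most_common(3))
```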

Microservices — Microservice architecture is a way of designing software applications as sets of modular, self-contained, independently deployable services. Whereas historically applications would be split into client-side, server-side/logic and database tiers, the idea with microservices is to develop each application as a suite of smaller, modular services, each running in its own process with a minimal amount of centralized management. Microservices architecture is appealing because it enables greater agility (entire applications don’t need to be taken down during change cycles), speed-to-market and code manageability.

(Source: http://martinfowler.com)
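
For a feel of how small a microservice can be, here’s a sketch of one written with nothing but the Python standard library: a single process that owns a single capability and exposes it over HTTP. The service name, port and payload are all made up; a real deployment would run many of these, each finding the others through the service-discovery layer described above.

```python
# A minimal "price quoting" microservice: one small process, one job, one HTTP API.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class PriceService(BaseHTTPRequestHandler):
    def do_GET(self):
        # The whole service does one thing: quote a (hard-coded) price for a SKU.
        body = json.dumps({"sku": self.path.strip("/"), "price_usd": 9.99}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Other services would call this over the network rather than link against it.
    HTTPServer(("0.0.0.0", 8000), PriceService).serve_forever()
```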

The application stack is an ever-evolving, dynamic organism. Ultimately, whether it’s microservices, distributed systems or containers, the changes we’re seeing at both the code and infrastructure level are about one thing: delivering better, more scalable software faster and cheaper. As a result, today we find ourselves at the outset of what I’ll call the warehouse computing era, defined by cheap, commodity infrastructure presented to the application as a pool of dynamically programmable resources, with intelligent, hardware-agnostic software as the control plane operating at n-scale.

To me, though, this is still an incremental step. The world of IT I envision is one where code is written once and is executed anywhere, automatically at scale, irrespective of the underlying cloud, OS, container engine, orchestrator, scheduler, etc. In this world, ops ceases to be an IT function and becomes a product within, or even a feature of, the underlying stack. This world is several years away, and before we get there I can promise the tech stack is going to get a lot more convoluted before it radically simplifies. Either way, it’ll be fun to watch.

Notes: two other fantastic overviews of the modern stack are from Joe Beda here and the team at CoreOS here.
