DevOps Concepts: Pets vs Cattle

Operational Service Models

One of the cardinal concepts in DevOps is the notion of pets vs cattle as a service model. It was first introduced by Bill Baker, on the topic of scaling up vs scaling out, in a presentation whose slide deck was titled Scaling SQL Server 2012. It was later popularized by Gavin McCance in his CERN Data Centre Evolution presentation.

Pets Service Model

In the pets service model, each pet server is given a loving name like zeus, ares, hades, poseidon, or athena. They are “unique, lovingly hand-raised, and cared for, and when they get sick, you nurse them back to health”. You scale these up by making them bigger, and when they are unavailable, everyone notices.

Examples of pet servers include mainframes, solitary servers, load balancers and firewalls, database systems, and so on.

Cattle Service Model

In the cattle service model, the servers are given identification numbers like web-01, web-02, web-03, web-04, and web-05, much the same way cattle are given numbers on tags attached to their ears. The servers are “almost identical to each other” and “when one gets sick, you replace it with another one”. You scale these out by creating more of them, and when one is unavailable, no one notices.

Examples of cattle servers include web server arrays, NoSQL clusters, queuing clusters, search clusters, caching reverse proxy clusters, multi-master datastores like Cassandra, big-data cluster solutions, and so on.
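The replace-rather-than-repair behaviour of the cattle model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Server and CattlePool classes, the health flag, and the numbering scheme are hypothetical and do not correspond to any real tool.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Server:
    name: str
    healthy: bool = True


@dataclass
class CattlePool:
    """A hypothetical pool of interchangeable, numbered servers."""
    prefix: str
    servers: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def _new_server(self) -> Server:
        # Names are a prefix plus a sequence number, e.g. web-01.
        return Server(name=f"{self.prefix}-{next(self._ids):02d}")

    def scale_out(self, n: int) -> None:
        # Scale by adding more identical servers, not by growing one.
        self.servers.extend(self._new_server() for _ in range(n))

    def replace_unhealthy(self) -> list:
        # A sick server is discarded and a fresh one takes its place.
        removed = [s.name for s in self.servers if not s.healthy]
        self.servers = [
            s if s.healthy else self._new_server() for s in self.servers
        ]
        return removed


pool = CattlePool(prefix="web")
pool.scale_out(3)                  # web-01, web-02, web-03
pool.servers[1].healthy = False    # web-02 "gets sick"
removed = pool.replace_unhealthy()
names = [s.name for s in pool.servers]
print(removed, names)
```

Note that the sick server is never repaired; the pool simply hands out the next number. A pet, by contrast, would keep its name and be nursed back to health in place.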

Evolution of Cattle

The cattle service model has evolved from the Iron Age (bare-metal, rack-mounted servers) to the Cloud Age (virtualized servers that are programmable through a web interface). This is a brief overview of the platforms and tools that have evolved in each of these eras.

The Iron Age

During the Iron Age of computing, it was the introduction of hardware virtualization that gave rise to managing systems as cattle. Robust configuration management tools like Puppet (2005), CFEngine 3 (2008), and Chef (2009) allowed operations teams to configure fleets of systems using automation.

The First Cloud Age

In this initial era, virtualization was extended to offer IaaS (Infrastructure as a Service), which virtualized the entire infrastructure (networks, storage, memory, CPU) into programmable resources. Popular platforms offering IaaS include Amazon Web Services (2006), Microsoft Azure (2010), and Google Cloud Platform (2011).

Such services gave rise to push-based orchestration tools like SaltStack (2011), Ansible (2012), and Terraform (2014). These tools let you coordinate state between the cloud provider and your application, and essentially let you program the infrastructure, a pattern called Infrastructure as Code.
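One way to picture Infrastructure as Code is as a comparison between a declared desired state and the state the provider currently reports. The sketch below is a simplified illustration, not how any of the tools above actually work: the plan function, the resource names, and the "size" attribute are all assumptions made for the example.

```python
def plan(desired: dict, actual: dict) -> list:
    """Return the actions needed to move `actual` towards `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Desired state, as it might be declared in version-controlled code
# (hypothetical resource names and attributes).
desired = {
    "web-01": {"size": "small"},
    "web-02": {"size": "small"},
    "db-01": {"size": "large"},
}

# State the provider currently reports (also hypothetical).
actual = {
    "web-01": {"size": "small"},
    "db-01": {"size": "medium"},
    "old-01": {"size": "small"},
}

actions = plan(desired, actual)
for action, name in actions:
    print(action, name)
```

Because the desired state is plain data, it can be reviewed, versioned, and re-applied; running the plan again once the two states match yields no actions.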

The Second Cloud Age

While automation was being built to virtualize aspects of the infrastructure, there were early movements to virtualize or partition aspects of the operating system (processes, network, memory, file system). This allows applications to be segregated into their own isolated environments without virtualizing hardware, which would duplicate the operating system for each application. Some of these technologies include OpenVZ (2005), Linux Containers or LXC (2008), and Docker (2013).

Container adoption became explosive, with Docker becoming a ubiquitous ecosystem in and of itself. A new set of technologies evolved to allocate resources for containers and schedule them across a cluster of servers: Apache Mesos (2009), Kubernetes (2014), Nomad (2015), and Swarm (2015).

These tools gave rise to what is called Immutable Production, where disposable containers are configured at deployment and replaced rather than changed in place.
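The idea of immutable production can be sketched as follows. This is a toy model under stated assumptions: the Container class, the deploy function, and the image tags app:1.0 and app:1.1 are hypothetical and are not part of any container runtime's API.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Container:
    image: str
    config: tuple  # (key, value) pairs fixed at deployment


def deploy(image: str, config: dict, replicas: int) -> list:
    # Configuration is baked in when the containers are created.
    frozen_config = tuple(sorted(config.items()))
    return [Container(image, frozen_config) for _ in range(replicas)]


running = deploy("app:1.0", {"LOG_LEVEL": "info"}, replicas=2)

# Changing a running container in place is not allowed...
try:
    running[0].image = "app:1.1"
    mutated = True
except FrozenInstanceError:
    mutated = False

# ...so a change means deploying replacements and discarding the old set.
running = deploy("app:1.1", {"LOG_LEVEL": "debug"}, replicas=2)
print(mutated, running)
```

The frozen dataclass stands in for the rule that a deployed container is never reconfigured; any change to the image or its configuration produces a new set of containers.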

The Current Ecosystem

With cloud computing platforms, where you can program the infrastructure (infrastructure as code) and apply immutable production with containers, orchestration tools tend to be more popular than runtime configuration tools. There will still be niche use cases where runtime configuration changes are required on servers managed as either cattle or pets. In my personal experience, Ansible, Terraform, and Chef have been the most popular platforms in recent years.

For container scheduling, Kubernetes is now the ubiquitous ecosystem, with implementations on popular cloud platforms: Google Kubernetes Engine (GKE), Azure Container Service (AKS), and the soon-to-be-released Amazon Elastic Container Service for Kubernetes (EKS).

In the big-data or streaming space with distributed platforms (Spark, Kafka, Flink, Storm, Hadoop, Cassandra, Samza, Akka, Finagle, and Heron, to name but a few), the Apache Mesos ecosystem is popular. Mesosphere and DC/OS are platforms that leverage Mesos to create a complete system for orchestrating these clusters. There is also support for running Kubernetes as a cluster on top of DC/OS, giving you access to both platforms. You can then use Kubernetes to schedule applications that use the big-data clusters scheduled by DC/OS.
