DevOps Concepts: Pets vs Cattle
Operational Service Models
One of the cardinal concepts in DevOps is the notion of pets vs cattle as a service model. It was first introduced by Bill Baker in a presentation on scaling up vs scaling out, which was included in a slide deck titled Scaling SQL Server 2012. It was later popularized by Gavin McCance in his CERN Data Centre Evolution presentation.
Pets Service Model
In the pets service model, each pet server is given a loving name like athena. They are “unique, lovingly hand-raised, and cared for, and when they get sick, you nurse them back to health”. You scale these up by making them bigger, and when they are unavailable, everyone notices.
Examples of pet servers include mainframes, solitary servers, load balancers and firewalls, database systems, and so on.
Cattle Service Model
In the cattle service model, the servers are given identification numbers like web-05, much the same way cattle are given numbers tagged to their ear. Each server is “almost identical to each other” and “when one gets sick, you replace it with another one”. You scale these by creating more of them, and when one is unavailable, no one notices.
Examples of cattle servers include web server arrays, NoSQL clusters, queuing clusters, search clusters, caching reverse proxy clusters, multi-master datastores like Cassandra, big-data cluster solutions, and so on.
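The cattle rules above (numbered names, scale by adding more, replace rather than repair) can be sketched in a few lines of Python. This is only an illustration: the `Server` and `CattlePool` classes and the `healthy` flag are hypothetical names invented here, not part of any real tool.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Server:
    name: str
    healthy: bool = True


@dataclass
class CattlePool:
    """A hypothetical pool of interchangeable, numbered servers."""
    prefix: str
    servers: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def _new_server(self) -> Server:
        # Cattle get sequential identifiers rather than hand-picked names.
        return Server(name=f"{self.prefix}-{next(self._ids):02d}")

    def scale_out(self, n: int) -> None:
        """Scale by adding more servers, not by making one bigger."""
        self.servers.extend(self._new_server() for _ in range(n))

    def replace_unhealthy(self) -> list:
        """Swap every sick server for a fresh one; return the retired names."""
        retired = [s.name for s in self.servers if not s.healthy]
        self.servers = [
            s if s.healthy else self._new_server() for s in self.servers
        ]
        return retired


pool = CattlePool(prefix="web")
pool.scale_out(5)                  # web-01 .. web-05
pool.servers[4].healthy = False    # web-05 gets sick
retired = pool.replace_unhealthy()
names = [s.name for s in pool.servers]
print(retired)  # ['web-05']
print(names)    # ['web-01', 'web-02', 'web-03', 'web-04', 'web-06']
```

Note that the sick server is never repaired: its slot is simply filled by the next number in the sequence, which is the behavior a pet server would not tolerate.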
Evolution of Cattle
The cattle service model has evolved from the Iron Age (bare-metal, rack-mounted servers) to the Cloud Age (virtualized servers that are programmable through a web interface). This is a brief overview of the platforms and tools that have evolved in each of these eras.
The Iron Age
During the Iron Age of computing, it was the introduction of hardware virtualization that gave rise to managing systems as cattle. Robust configuration management tools like Puppet (2005), CFEngine 3 (2008), and Chef (2009) allowed operations teams to configure fleets of systems using automation.
The First Cloud Age
In this initial era, virtualization was extended to offer IaaS (Infrastructure as a Service), which virtualized the entire infrastructure (networks, storage, memory, CPU) into programmable resources. Popular platforms offering IaaS are Amazon Web Services (2006), Microsoft Azure (2010), and Google Cloud Platform (2011).
Such services gave rise to push-based orchestration tools like SaltStack (2011), Ansible (2012), and Terraform (2014). These tools allow you to coordinate state between the cloud provider and your application, essentially letting you program infrastructure, a pattern called Infrastructure as Code.
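The idea of coordinating state can be illustrated with a minimal sketch: describe the infrastructure you want as data, compare it with what currently exists, and derive the actions that close the gap. The `plan` and `apply_plan` functions and the server names below are hypothetical, and the sketch does not reproduce how any particular tool works internally.

```python
def plan(desired: dict, actual: dict) -> list:
    """Compare desired state with actual state and return ordered actions."""
    actions = []
    for name in sorted(desired.keys() - actual.keys()):
        actions.append(("create", name, desired[name]))
    for name in sorted(desired.keys() & actual.keys()):
        if desired[name] != actual[name]:
            actions.append(("update", name, desired[name]))
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(("delete", name, None))
    return actions


def apply_plan(actions: list, actual: dict) -> dict:
    """Return a new actual state after carrying out the planned actions."""
    result = dict(actual)
    for verb, name, spec in actions:
        if verb == "delete":
            del result[name]
        else:
            result[name] = spec
    return result


# Desired state lives in code (and therefore in version control).
desired = {
    "web-01": {"size": "small"},
    "web-02": {"size": "small"},
    "db-01": {"size": "large"},
}
# Actual state is whatever the provider reports right now (invented here).
actual = {
    "web-01": {"size": "small"},
    "db-01": {"size": "medium"},
    "old-cache": {"size": "small"},
}

actions = plan(desired, actual)
converged = apply_plan(actions, actual)
print(actions)
print(plan(desired, converged))  # [] once the two states match
```

Running the plan a second time yields no actions, which is the property that makes this style of tooling safe to re-run.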
The Second Cloud Age
While automation was being built to virtualize aspects of the infrastructure, there were early movements to virtualize or partition aspects of the operating system (processes, network, memory, file system). This allows applications to be segregated into their own isolated environments without the need to virtualize hardware, an approach that duplicates the operating system for each application. Some of these technologies include OpenVZ (2005), Linux Containers or LXC (2008), and Docker (2013).
Container adoption exploded, with Docker becoming a ubiquitous ecosystem in and of itself. A new set of technologies evolved to allocate resources for containers and schedule these containers across a cluster of servers: Apache Mesos (2009), Kubernetes (2014), Nomad (2015), and Swarm (2015).
These tools give rise to what is called Immutable Production, where disposable containers are configured at deployment.
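A rough sketch of the immutable idea, using a frozen dataclass to stand in for a container: configuration is fixed when the container is created, and an upgrade means deploying replacements instead of patching what is running. The `Container` class, the `deploy` function, and the `shop:1.0` image tag are all invented for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Container:
    image: str   # hypothetical image tag such as "shop:1.0"
    env: tuple   # configuration fixed at deployment time


def deploy(image: str, env: dict, replicas: int) -> list:
    """Build a fresh set of identical containers; nothing is patched in place."""
    frozen_env = tuple(sorted(env.items()))
    return [Container(image, frozen_env) for _ in range(replicas)]


running = deploy("shop:1.0", {"LOG_LEVEL": "info"}, replicas=3)

# Changing a running container is rejected...
try:
    running[0].image = "shop:1.1"
    mutated = True
except FrozenInstanceError:
    mutated = False

# ...so an upgrade is a new deployment that replaces the old containers.
previous, running = running, deploy("shop:1.1", {"LOG_LEVEL": "info"}, replicas=3)
print(mutated)                        # False
print({c.image for c in running})     # {'shop:1.1'}
print({c.image for c in previous})    # {'shop:1.0'}
```

Because the old set is left untouched, rolling back amounts to pointing traffic at `previous` again, which is one practical appeal of disposable containers.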
The Current Ecosystem
With cloud computing platforms, where you can program the infrastructure (infrastructure as code) and apply immutable production with containers, orchestration tools tend to be more popular. There will still be niche use cases where runtime configuration management is required on servers managed as either cattle or pets. In my personal experience, Ansible, Terraform, and Chef have been the most popular platforms in recent years.
For container scheduling solutions, Kubernetes is now the ubiquitous ecosystem, with implementations on popular cloud platforms: Google Kubernetes Engine (GKE), Azure Container Service (AKS), and the soon-to-be-released Amazon Elastic Container Service for Kubernetes (EKS).
In the big-data or streaming space, with distributed platforms such as Spark, Kafka, Flink, Storm, Hadoop, Cassandra, Samza, Akka, Finagle, and Heron, to name but a few, the Apache Mesos ecosystem is popular. Mesosphere and DC/OS are platforms that leverage Mesos to create a complete system for orchestrating these clusters. There is also support for running Kubernetes as a cluster on top of DC/OS, giving you access to both platforms: DC/OS schedules the Kubernetes cluster alongside your big-data clusters, and Kubernetes in turn schedules the applications that use them.
Pets vs. Cattle Concept
- Scaling SQL Server 2012 by Glenn Berry (incorporates earlier work by Bill Baker), PASS.org, 2012
- CERN Data Centre Evolution by Gavin McCance, DevOps at CERN, Nov. 19, 2012
- Are your servers PETS or CATTLE? by Simon Sharwood, The Register, Mar. 18, 2013
- Pets vs Cattle by Noah Slater, Engine Yard, Feb. 26, 2014
- The History of Pets vs Cattle and How to Use the Analogy Properly by Randy Bias, CloudScaling, Sep. 29, 2016
- InfrastructureAsCode by Martin Fowler, Mar. 1, 2016
- Infrastructure as Code: A Reason to Smile, Jafari Sitakange, ThoughtWorks, Mar. 14, 2016
- Infrastructure as Code: From the Iron Age to the Cloud Age, Kief Morris, ThoughtWorks, Jan. 8, 2016
- ImmutableServer, Kief Morris, Martinfowler.com, Jun. 13, 2013
- SnowflakeServer, Martin Fowler, Martinfowler.com, Jul. 10, 2012
- PhoenixServer, Martin Fowler, Martinfowler.com, Jul. 10, 2012
- Trash Your Servers and Burn Your Code: Immutable Infrastructure and Disposable Components, Chad Fowler, Jun. 23, 2013.
- An introduction to immutable infrastructure, Josh Stella, O'Reilly Ideas, Jun. 9, 2015.