Hate to Wait
“Capacity is fundamentally a measure of how much revenue the system can generate during a given period of time.”
In reading about capacity in Michael T. Nygard’s book “Release It!,” there is an underlying theme: there is no such thing as cheap anything.
When bad design choices are made, capacity is reduced and, therefore, the company’s potential revenue decreases. To address and resolve these bad choices, companies must incur capital and operational expenses that accumulate over time. So, while CPU or disk space is considered cheap (compared to its cost decades ago), there are still hidden costs.
Another takeaway, evident from our own experiences on the Internet, is that users hate to wait.
Let’s dig deeper into terminology.
- Performance: a measure of how fast the system processes a single transaction. This can be measured in isolation or under load.
- Throughput: the number of transactions the system can process in a given time span. This is typically what users/customers care about: the performance of their own transactions.
- Scalability: there are two competing definitions, but the latter is the one that is commonly used. 1) how throughput changes under varying loads, or 2) the ability to add more capacity to a system.
- Capacity: the maximum throughput a system can sustain while maintaining an acceptable response time for each transaction.
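To tie these terms together, here is a toy sketch with made-up numbers of my own (not from the book): performance is the per-transaction latency, throughput follows from how many transactions complete per second, and capacity is the throughput ceiling before response times become unacceptable.

```python
# Toy numbers of my own, just to relate the terms above.

latency_s = 0.2   # performance: one transaction takes 200 ms
workers = 10      # concurrent request handlers

# throughput: transactions completed per second across all workers
throughput = workers / latency_s   # 50 transactions/second

# capacity: the highest sustained throughput at which response time is
# still acceptable; beyond it, queueing pushes latency past the limit
print(f"throughput: {throughput:.0f} tx/s")
```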
First, let’s talk about cheapness. Nygard dispels the notion that central processing units (CPUs) are cheap. While he argues that the silicon microchips are cheap, the CPU cycles are not. Every cycle takes time, and wasted cycles (particularly slow ones) add latency, which may require additional servers to handle the load. His example:
“Suppose that an application takes just 250 milliseconds of extra processing per transaction. If the system processes a million transactions a day, that extra 250 milliseconds per transaction makes for an extra 69.4 hours of compute time every day.”
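Nygard’s arithmetic checks out, and is easy to reproduce:

```python
# Verifying the quote: 250 ms of extra work per transaction,
# across one million transactions a day.
transactions_per_day = 1_000_000
extra_seconds_per_tx = 0.250

extra_hours_per_day = transactions_per_day * extra_seconds_per_tx / 3600
print(f"{extra_hours_per_day:.1f} extra hours of compute per day")  # 69.4
```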
Also, there are fiscal costs associated with slower processes. There are only so many chips that can fit into a single machine. If a machine can only fit four CPUs, then buying a fifth CPU is disproportionately more expensive. Adding another CPU may require buying a new chassis (with its own RAM, local disk, cooling fans, and Fibre Channel adapter) — and in terms of its location, it may need its own rack space or floor space. There is also the cost of monitoring and managing an extra box and the burden on the data center’s cooling system. It is truly best to have less.
Whether it is with storage or bandwidth, the multiplier effect demonstrates that the true costs go far beyond the individual piece of hardware.
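A quick illustration of the multiplier effect, using entirely hypothetical figures of my own: the sticker price of one extra CPU understates the true cost once a new chassis, rack space, and ongoing management are involved.

```python
# Made-up figures to illustrate the multiplier effect described above.
cpu_price = 400            # the "cheap" part
chassis = 3000             # new box: RAM, local disk, fans, Fibre Channel adapter
rack_space_per_year = 1200 # its own rack or floor space
ops_per_year = 2000        # monitoring, management, cooling burden

first_year_cost = cpu_price + chassis + rack_space_per_year + ops_per_year
multiplier = first_year_cost / cpu_price
print(f"first-year cost: ${first_year_cost}, or {multiplier:.1f}x the CPU price")
```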
Secondly, it is worth noting two different ways to scale. Horizontally scalable systems grow by adding more servers (also known as “getting wide”), whereas vertically scalable systems grow by upgrading existing servers (“getting big”). With the former, each server can run without knowing anything about the other servers. Examples of horizontally scalable systems include web servers and Ruby on Rails application servers. The downside is the overhead of cluster management. With vertically scalable systems, RAM and CPUs get added so that each individual server is as large as possible.
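The horizontal approach lends itself to simple back-of-the-envelope planning. A sketch with assumed numbers of my own: given a target load and the capacity of one commodity server, getting wide just means adding enough identical, mutually unaware servers (the vertical alternative would be one much bigger box).

```python
import math

# Hypothetical figures: a target load and the assumed capacity
# of a single commodity server.
target_tx_per_s = 500
per_server_tx_per_s = 60

# Horizontal scaling: add identical servers until the target is covered.
servers_needed = math.ceil(target_tx_per_s / per_server_tx_per_s)
print(servers_needed)  # 9
```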
Here’s a great diagram from the book illustrating horizontal vs. vertical scaling (I literally took a photo of it):
All in all, capacity management is an ongoing process of monitoring and optimizing. Capacity is contingent on a series of external factors including software changes, traffic changes, and marketing campaigns. Unfortunately, capacity is not linear and requires a bird’s eye view of how a system is operating as a whole. Much to learn in the real world!