Capacity

As I continue to read through Michael T. Nygard’s Jolt Award-winning book Release It!, I keep running into topics that I hope will stick in my head for future use. However, I know how little my mind will hold in the weeks to come. To that end, I am storing ideas here that will benefit my forgetful future self. Today’s post covers Chapter 8 of the book and consists of the major takeaways that I fear might exceed my brain’s capacity.

First off is a definition and clarification of the word capacity in the context of building large systems. Primarily, capacity is “the maximum throughput a system can sustain, for a given workload, while maintaining an acceptable response time.” It is important not to conflate this with the closely related ideas of performance, throughput, and scalability. Understanding the capacity of a system requires finding the variables and constraints that will affect it. There is no fixed value that can be used to compute capacity but, generally, it boils down to this: end users care about the performance of their transactions, while customers are interested in either throughput or capacity.
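To make the definition concrete for future me, here is a minimal sketch of how it could be applied to load-test results. The `LoadSample` type, the `estimate_capacity` function, the use of a 95th-percentile response time, and every number below are my own made-up illustrations, not anything from the book.

```python
from typing import NamedTuple, Optional, Sequence


class LoadSample(NamedTuple):
    """One step of a hypothetical load test against a fixed workload."""
    throughput_rps: float   # requests per second sustained during this step
    p95_response_ms: float  # 95th-percentile response time seen at this step


def estimate_capacity(samples: Sequence[LoadSample],
                      max_response_ms: float) -> Optional[float]:
    """Return the highest measured throughput whose response time was acceptable.

    Returns None if no step met the response-time limit.
    """
    acceptable = [s.throughput_rps for s in samples
                  if s.p95_response_ms <= max_response_ms]
    return max(acceptable) if acceptable else None


# Made-up measurements: response time degrades sharply past 300 requests/second.
samples = [
    LoadSample(100, 120),
    LoadSample(200, 150),
    LoadSample(300, 240),
    LoadSample(400, 900),
]

print(estimate_capacity(samples, max_response_ms=250))  # prints 300
```

Note that the answer depends on both the workload and the response time I am willing to accept, which is exactly why capacity is not the same thing as raw throughput.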

Finding the variables that contribute to a system’s capacity is difficult, but Nygard’s general advice is to consider the system as a whole. A first step is finding the “driving variables”. His examples include user demand, the clock, and the calendar: generally, things that are out of your control. “Following variables” are correlated with one or more driving variables and consist of “measurable performance statistics” such as bandwidth, CPU, and memory usage. Identifying these two types of variables and their interactions throughout the different layers of a system is the hard part. Once they are found, the constraint (the limiting factor that hits its ceiling first) can be identified, and alleviating that constraint will increase capacity.
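As a rough sketch of the idea, the snippet below correlates one driving variable with a few following variables and then picks the following variable that gets closest to its limit as the likely constraint. The helper names `pearson` and `find_constraint`, the choice of Pearson correlation, and all of the sample data are my own assumptions for illustration; the book does not prescribe this method.

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no meaningful correlation
    return cov / sqrt(var_x * var_y)


def find_constraint(utilization: Dict[str, Sequence[float]]) -> str:
    """Name the following variable whose peak utilization is highest."""
    return max(utilization, key=lambda name: max(utilization[name]))


# Driving variable (made up): active users over five measurement windows.
active_users = [100, 200, 300, 400, 500]

# Following variables (made up), each as a fraction of its own limit.
following = {
    "cpu": [0.10, 0.20, 0.30, 0.40, 0.50],
    "memory": [0.30, 0.45, 0.62, 0.80, 0.95],
    "bandwidth": [0.05, 0.05, 0.06, 0.05, 0.06],
}

for name, series in following.items():
    print(name, round(pearson(active_users, series), 2))

print(find_constraint(following))  # prints memory
```

In this toy data, memory is the variable nearest its ceiling, so adding CPU would not help. A real system has many more layers and interactions than three tidy lists, which is why finding these variables is the hard part.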

Finally, there are two ways to scale a large system: horizontally and vertically. Horizontal scaling means adding servers, while vertical scaling means making the current servers more powerful. Horizontal scaling is easier the less each component knows about other components, whereas vertical scaling should be used when adding more servers is impractical or impossible. Generally, horizontal scaling is more financially flexible.
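For a back-of-the-envelope feel for horizontal scaling, here is a tiny sketch that estimates how many servers a target load would need. It assumes capacity grows linearly with server count, which only holds when servers share nothing and no shared resource (a database, for example) becomes the constraint. The function name `servers_needed` and the numbers are hypothetical.

```python
import math


def servers_needed(target_rps: float, per_server_rps: float) -> int:
    """Servers required for target_rps, assuming perfectly linear scaling."""
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    return math.ceil(target_rps / per_server_rps)


# Made-up numbers: each server sustains 300 requests/second.
print(servers_needed(1000, 300))  # prints 4
print(servers_needed(1200, 300))  # prints 4
print(servers_needed(1201, 300))  # prints 5
```

The small steps are the point: capacity can be bought one server at a time as demand grows, which is one way to read "financially flexible".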

These are a few of the important ideas to keep in mind as I take a closer look at the capacity patterns and antipatterns in the chapters to follow. There are pitfalls aplenty when optimizing the capacity of a system. Keep in mind that nothing is cheap, even in a world of seemingly infinitely powerful computers! A few other quotes I hope to impart to my future self before I move on are these: “Protect request-handling threads”, “Monitor capacity continuously”, and “Improving non-constraint metrics will not improve capacity.” Thanks, future me; hopefully your capacity hasn’t blown up yet.