The Scale Cube

Karan Sharma
Jul 11 · 2 min read

The scale cube is a useful visualization of a three-dimensional scalability model, shown in the figure below:

This cube is described in Martin Abbott and Michael Fisher’s excellent book, The Art of Scalability (Addison-Wesley, 2015).

The scale cube defines three separate ways to scale an application: X, Y, and Z.

X-axis scaling load-balances requests across multiple identical instances of the application. It is a common way to scale a monolithic application: N instances run behind a load balancer, which distributes the requests among them. This way of scaling improves the capacity and availability of the application.
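As a rough illustration of X-axis scaling, here is a minimal Python sketch of a round-robin load balancer. The class name RoundRobinBalancer and the instance names are hypothetical, and round-robin is only one of several distribution policies a real load balancer might use.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that cycles through identical instances."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = cycle(instances)

    def route(self, request):
        # Every instance is identical, so any of them can serve the request.
        return next(self._instances)


# Three clones of the same application (assumed names).
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.route(f"req-{i}") for i in range(6)]
```

Because the instances are interchangeable, adding capacity is just a matter of adding another name to the list.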

Z-axis scaling also runs multiple instances of the application, but here each instance works on only a subset of the data. The data is partitioned among the N identical instances, and the load balancer routes each request to the right instance by using a request attribute. An application might, for example, route requests using userId.
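A minimal sketch of Z-axis routing, assuming the request is a dictionary carrying a userId field and that partitions are chosen by hashing that field. The function names and instance names are hypothetical, and hash-modulo is just one possible partitioning scheme.

```python
import hashlib

# Each instance runs the same code but owns one partition of the data.
INSTANCES = ["app-shard-0", "app-shard-1", "app-shard-2"]  # assumed names


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a userId to a shard index using a stable hash."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would route inconsistently across restarts.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(request: dict) -> str:
    """Pick the instance responsible for this request's userId."""
    return INSTANCES[shard_for(request["userId"], len(INSTANCES))]
```

The same userId always lands on the same instance, which is what lets each instance hold only its own slice of the data.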

Y-axis scaling functionally decomposes an application into services (also known as microservices).

X- and Z-axis scaling improve the application’s capacity and availability, but neither approach solves the problem of increasing development and application complexity.

Y-axis scaling splits the application into multiple services, each of which performs a specific function. A large monolithic application is thus decomposed into small services, each of which can be scaled further and independently by using X-axis and Z-axis scaling.
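To make the Y-axis idea concrete, here is a small sketch in which requests are routed by function (here, by URL path prefix) to separate services, and each service has its own pool of cloned instances. The service names, path prefixes, and pool sizes are all assumptions made for illustration.

```python
from itertools import cycle

# Hypothetical services, each owning one function of the application and
# scaled on its own: the pool sizes differ per service.
SERVICES = {
    "/orders": cycle(["orders-1", "orders-2"]),
    "/payments": cycle(["payments-1"]),
    "/catalog": cycle(["catalog-1", "catalog-2", "catalog-3"]),
}


def route(path: str) -> str:
    """Send a request to the service that owns its path prefix."""
    for prefix, pool in SERVICES.items():
        if path.startswith(prefix):
            # Within a service, requests rotate across its clones.
            return next(pool)
    raise LookupError(f"no service owns {path}")
```

A busy catalog can be given more instances without touching the orders or payments services, which is the independence that functional decomposition buys.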
