
Scale and its implications
Recently, I shared some thoughts on Apache Mesos. Today I’d like to cover one aspect in greater detail, one that keeps coming up in Mesos discussions: scale and its implications.
Let us first define what I mean by scale — in this context, simply the scope and size of the operation of an application. Here are a few classes with examples:
1. An eCommerce app such as Magento, using a single MySQL database instance.
2. A sharded MongoDB cluster, comprising maybe a dozen machines, as seen here.
3. Hadoop clusters, ranging from a couple of hundred into the thousands of nodes.
4. Mesos clusters, which work perfectly fine at smaller node counts but are really optimised for clusters in the 10k+ node range.
While in cases 1 and 2 you might mainly be concerned with the health of a particular node (especially, I suppose, if it’s your one and only server) as well as its utilisation and performance, once you move towards cases 3 and 4 a single node no longer counts.
A cruel but accurate comparison: imagine yourself commanding an entire army. Sorry to say, but the horrible truth is that a single soldier is cannon fodder.
When operating applications at scale, whether operational or analytical in nature, different things matter. Ideally, you then have a true datacenter operating system that helps you optimise the resources at hand and run your apps reliably; see also Joe Stein’s intro on this topic.
I’d like to close this post with a bit of a provocation: while you can argue that not every organisation needs, say, a 40k node cluster on a daily basis, there are good reasons to think in these dimensions, be it the demands of your new-generation IoT application or a seasonal business that needs 500x the compute capacity on 5 out of 365 days a year to handle the load.
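To put a number on the seasonal case, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that a normal day needs 1x capacity, that the 5 peak days need 500x, and that the cluster is statically sized for the peak all year round. The function name and the 1x baseline are my own assumptions, not anything Mesos-specific.

```python
# Back-of-the-envelope sketch: how busy is a cluster that is sized for peak load?
# All figures are illustrative and follow the seasonal example in the text.

DAYS_PER_YEAR = 365
PEAK_DAYS = 5        # days per year that need the extra capacity
PEAK_FACTOR = 500    # peak demand relative to a normal day (assumed to be 1x)


def average_utilisation(peak_days: int, peak_factor: float,
                        days: int = DAYS_PER_YEAR) -> float:
    """Fraction of a peak-sized cluster that is actually used over a year."""
    normal_days = days - peak_days
    used = normal_days * 1 + peak_days * peak_factor  # capacity-days consumed
    provisioned = days * peak_factor                  # capacity-days provisioned
    return used / provisioned


if __name__ == "__main__":
    u = average_utilisation(PEAK_DAYS, PEAK_FACTOR)
    print(f"Average utilisation of a statically peak-sized cluster: {u:.2%}")
```

Under these assumptions the peak-sized cluster is used at well under 2% on average, which is the gap that scaling up and down on demand is meant to close.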
Data gravity questions aside, what you’re then looking for is a solution that works both on-premises and in a public cloud setting and that can auto-scale up and down based on your requirements. Guess what: Mesos can do that …