Goldilocks services

Monolith vs. microservices.

A question in the rich tradition of interminable debates such as “vi vs. emacs” or “tabs vs. spaces”. But there is an answer that is both surprising and obvious at the same time: Goldilocks services. Services that are not too big, not too small, but just right.

First, the monolith vs. microservices debate can be answered separately at the code, data, or runtime layer. Proponents of either side often go whole hog on all three, but it does not need to be this way. More on this later.

The advantages of the monolith are pretty obvious. It’s simple, self-contained. Components communicate in memory, exchanging objects as rich as you want them to be. There is little overhead that is not related to the application itself.
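To make "exchanging objects as rich as you want them to be" concrete, here is a minimal Python sketch. The `Invoice` type and the helper functions are hypothetical names invented for illustration. It contrasts an in-memory call, where the callee receives the object with its types intact, with the flattening and rebuilding that a service boundary would require.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from decimal import Decimal


@dataclass
class Invoice:
    """Hypothetical rich object passed between components."""
    customer: str
    total: Decimal
    issued_at: datetime


def total_with_tax(invoice: Invoice, rate: Decimal) -> Decimal:
    # In-process call: the callee gets the same object, types intact.
    return invoice.total * (1 + rate)


def to_wire(invoice: Invoice) -> str:
    # Across a service boundary, the object has to be flattened to text.
    # json.dumps cannot encode Decimal or datetime on its own, so each
    # rich field needs an explicit conversion.
    payload = asdict(invoice)
    payload["total"] = str(invoice.total)
    payload["issued_at"] = invoice.issued_at.isoformat()
    return json.dumps(payload)


def from_wire(raw: str) -> Invoice:
    # The receiving side has to rebuild the rich types by hand.
    data = json.loads(raw)
    return Invoice(
        customer=data["customer"],
        total=Decimal(data["total"]),
        issued_at=datetime.fromisoformat(data["issued_at"]),
    )


invoice = Invoice("acme", Decimal("19.99"), datetime(2024, 1, 1, tzinfo=timezone.utc))
```

In the monolith, `total_with_tax(invoice, Decimal("0.20"))` is the whole story. With a separate service, `to_wire` and `from_wire` (plus the network call between them) are overhead that has nothing to do with the application itself.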

The advantages of microservices are also obvious. Each service has all the advantages of a monolith, but the services stay independent of one another, which prevents unnecessary coupling.

Microservices are like the OO of architecture. Proponents love combining trivial units of work and underestimate the overhead of communication. Monolith fiends are like scripting fans: give them a Swiss Army knife and they will do everything with it. And as with every Swiss Army knife, you may not be able to use the bottle opener at the same time as the hacksaw, but who cares.

To solve this dilemma, we need to look at how software projects are built. They start from scratch, and grow. Early projects simply cannot afford the overhead of microservices. I laugh when I see posts on Hacker News that sound like “I did this project over the weekend, and wrote 10 microservices to power it”. To me, that sounds like you just needed 10 objects in your monolith.

My rule of thumb is that if your software isn’t making money, it’s not a candidate for microservices. And yes, your series A doesn’t count if customers aren’t paying you.

But how far can you push a monolithic application?

The first breaking point comes when the codebase no longer fits in a normal developer’s head. And by normal, I don’t mean the founder/CTO who wrote 90% of it, I mean developer #20 who’s been at the company for 3 months. If your code is so sloppy that even the founder/CTO doesn’t understand it, you need to fix that first; microservices won’t help you. If you still want to push the monolith past the ‘fit in the head’ point, use a hybrid strategy. Keep the monolith for deployment and data, but start creating services in your code. Sometimes that means creating packages (ruby gems, npm modules, python packages, whatever); sometimes cleaning up your code so that interdependencies are explicit is enough.
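As a rough illustration of "creating services in your code", here is a minimal Python sketch of a service boundary that lives entirely inside the monolith. Every name in it (`BillingService`, `InProcessBilling`, `Checkout`, `Charge`) is hypothetical. The only point is that the dependency between the two components is explicit and narrow.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Charge:
    customer_id: str
    amount_cents: int


class BillingService(Protocol):
    """The only surface other packages are allowed to depend on."""

    def charge(self, customer_id: str, amount_cents: int) -> Charge: ...


class InProcessBilling:
    """Runs inside the monolith: same deploy, same database."""

    def __init__(self) -> None:
        self._ledger: list[Charge] = []

    def charge(self, customer_id: str, amount_cents: int) -> Charge:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        entry = Charge(customer_id, amount_cents)
        self._ledger.append(entry)
        return entry


class Checkout:
    """Depends on the interface, never on billing internals."""

    def __init__(self, billing: BillingService) -> None:
        self._billing = billing

    def place_order(self, customer_id: str, prices_cents: list[int]) -> Charge:
        return self._billing.charge(customer_id, sum(prices_cents))


# Wiring happens in one place; it is still a plain in-memory call.
checkout = Checkout(InProcessBilling())
receipt = checkout.place_order("customer-1", [500, 250])
```

If billing ever does need to be extracted, only the wiring line changes: a different class implementing `charge` (for example one that calls a remote service) can be passed to `Checkout`. Until then, moving the boundary is an ordinary refactor.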

The beauty of the monolith at this stage is that you can get it wrong, and a bad architecture decision can be fixed by a single deploy of a code change. If you have made a bad architecture decision with microservices and poorly defined your boundaries, moving those boundaries is insanely slow and risky. You have to do everything in lockstep, migrating code and data in a way that will not break any service. Compare this with a simple method refactor in a codebase and see what bliss feels like.

The second breaking point of monoliths comes from Dunbar’s number. Above roughly 150 developers, it’s almost impossible for anyone to understand everything that is happening. Coordinating development on a large monolith becomes the limiting factor. Good software architecture will save you, but only as long as you maintain excellent practices about where the boundaries are. At this point you will have a hybrid architecture, where clear service boundaries exist within the monolith. If your code is a mess of spaghetti, you probably need to fix that first before attempting a migration to microservices.

The last one is Conway’s law: software tends to mirror the organization that built it.

As the organization grows, independent teams form and they will likely introduce separate services. Even if you are a fan of the monolith, it is likely a good thing, as long as you keep the runtime environment pretty standardized. For example, if you use a framework, make sure everyone is using the same versions. If you need to mix the languages/environments you use, make sure the guidelines for using one or the other are explicit. This is also where the role of the SRE will increase as the glue layer between all the various organizations.

Now of course, there are situations in which a service extraction makes sense earlier on: regulations, security, different scaling characteristics, etc. But those are mostly concerns of late-stage companies. Yes, your Facebook competitor will need something different to handle the millions of page views you’ll get. For now, most of your traffic is you on the administration console, checking for the visitors you are waiting for.

So you don’t need independent services. If you are building them, make them as big as possible, but not too big. Stick with a monolith until it becomes obvious that large parts of it should be extracted.