Splitting up a monolith using a teams-first approach

Published in NS-Techblog, Jan 9, 2023

Summary

In this blog, we explain why a team-first-oriented approach is preferred to a more technically driven approach to split up a monolith into smaller services.

The team comes first. That’s also the case when breaking up a monolith!

Introduction

Every developer knows from experience that an unexpected effect of a software change is much faster to resolve when that change is small: the problem is almost certainly caused by the small change itself and is therefore readily located, identified, and solved.

So by keeping the batch size of changes small (and thus keeping the deployment frequency high), we are able to address problems quickly and adequately at all times.

A monolith, on the other hand, often makes it difficult, if not impossible, to deploy small pieces of functionality to production quickly, effortlessly, and reliably, since the entire monolith must be deployed for every small change. In addition, the complexity of a monolith is often high, which makes it difficult to pinpoint the (unintended) consequences of a modification. As a result, making small changes in a monolith is often an inherently difficult and cumbersome process.

In this blog, drawing on the book Team Topologies, we explain that there is not just one type of monolith and that, as a consequence, there are several types of fracture planes. Throughout, we emphasize the importance of a team-first approach.

Finally, we discuss an application to a case at NS.

Different types of monoliths

The archetypal monolith as a metaphor for a single large and complex piece of software.

In any case, it is good to first realize that there are several types of monoliths. For example, the book Team Topologies distinguishes between:

• Application monolith: a single large application with many dependencies and responsibilities that often boasts a plethora of features.

• Joined-at-the-database monolith: different applications and/or services linked to the same database schema, making it difficult to modify, test, and deploy them independently.

• Monolithic builds: applications and/or services that cannot be built independently of each other, but depend on each other to do so.

• Monolithic releases: when components, while built independently, cannot be deployed independently.

• Monolithic model: in this world, people think that everything is part of “the total chain” at all times, so independent deployment of components is by definition and a priori impossible.

• Monolithic thinking: the “one-size-fits-all” thinking, for example, that all information should originate from just one source.

• Monolithic workplace (open-space office): a uniform one-size-fits-all “collocation of bodies,” when in fact we are looking for a case-by-case optimized “collocation of purpose.”
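
To make the joined-at-the-database monolith a bit more concrete, the minimal Python sketch below lists the tables that are used by more than one service. The service and table names are hypothetical and purely illustrative; in practice, such a mapping would have to come from an inventory of the actual systems.

```python
from collections import defaultdict

# Hypothetical inventory: which service reads or writes which tables.
SERVICE_TABLES = {
    "ticketing": {"customers", "orders"},
    "invoicing": {"customers", "invoices"},
    "reporting": {"orders", "invoices"},
    "notifications": {"messages"},
}


def shared_tables(service_tables):
    """Return {table: sorted list of services} for tables used by 2+ services."""
    users = defaultdict(set)
    for service, tables in service_tables.items():
        for table in tables:
            users[table].add(service)
    return {
        table: sorted(services)
        for table, services in users.items()
        if len(services) > 1
    }


if __name__ == "__main__":
    for table, services in sorted(shared_tables(SERVICE_TABLES).items()):
        print(f"{table}: shared by {', '.join(services)}")
```

Every table in the result is a point where the services involved cannot be modified, tested, and deployed independently of each other.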

A fracture plane is a natural line along which a monolith can easily be broken up, something that our Stone Age ancestors already knew how to make use of.

Given the fact that there are so many different types of monoliths, it will come as no surprise that there are also many different types of fracture planes along which all these different types of monoliths can or should be broken up.

Different types of fracture planes

Big chunks of stone split up by their fracture planes.

We can think of the following types of fracture planes:

• The business domain bounded context: these bounded contexts constitute separate and internally consistent conceptual units within a business domain.

• Regulatory compliance: those parts of the system that pertain to the same (set of) regulatory requirements, e.g. credit card payments.

• Change cadence: the frequency with which changes to the system are required can be used as a grouping mechanism.

• Team location: the further teams are separated, the more difficult communication becomes. In the extreme case, you even have to deal with time differences! For a team to work well, its members must sit together or work according to the so-called remote-first principle. The software architecture will then (have to) adapt accordingly.

• Risk: we can often divide the parts of the monolith into different risk categories for the business. For example, a system may consist of both mission-critical parts and parts for which a workaround can easily be found.

• Performance isolation: not all parts of your system necessarily need to handle the maximum load or have 24/7 uptime.

• Technology: traditionally a common means to break up a monolith, but usually a bad idea, as it often does not allow teams to remain end-to-end responsible and hence to operate as an autonomous unit. In some exceptional cases, however, this can still be a useful fracture plane.

• User personas: different roles, such as user or administrator, may be used to divide the system into different parts.

• Natural fracture planes for your specific organization or technology: sometimes other types of team-first-oriented and organization-specific fracture planes may be identified.
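
As an illustration of the change cadence fracture plane listed above, the following minimal Python sketch groups modules by how often they changed in a given period. The module names, the change counts, and the threshold of 10 changes are hypothetical assumptions; real numbers could, for instance, be derived from version control history.

```python
def group_by_cadence(changes_per_period, threshold=10):
    """Split modules into 'fast' and 'slow' groups by their number of changes.

    The threshold is an arbitrary, hypothetical cut-off for this sketch.
    """
    groups = {"fast": [], "slow": []}
    for module, count in sorted(changes_per_period.items()):
        key = "fast" if count >= threshold else "slow"
        groups[key].append(module)
    return groups


if __name__ == "__main__":
    # Hypothetical change counts per module for one quarter.
    changes = {"pricing": 42, "timetable-import": 3, "display": 17, "archive": 1}
    print(group_by_cadence(changes))
```

Modules that end up in the same group are candidates for being split off together; whether that split is actually a good idea still depends on the team that has to own the result.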

All of these fracture planes are potentially valid candidates for breaking up a monolith, as long as the resulting architecture supports more autonomous teams with a reduced cognitive load. The criterion at all times is the team: can the team handle the cognitive load of the new components, modules, or services? Can the team make changes independently and roll them out to production? Can the team deliver business value independently? And so on.
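
The team-first criterion can be written down as a simple checklist. The Python sketch below is only an illustration under stated assumptions: the class name, the field names, and the example splits are hypothetical, and the three checks mirror the questions above without claiming to be exhaustive.

```python
from dataclasses import dataclass


@dataclass
class CandidateSplit:
    """A proposed part of the monolith, assessed from the owning team's view."""
    name: str
    fits_cognitive_load: bool           # can the team handle the cognitive load?
    deploys_independently: bool         # can the team change and roll out on its own?
    delivers_value_independently: bool  # can the team deliver business value on its own?


def failed_criteria(split):
    """Return the team-first criteria that the proposed split does not meet."""
    checks = {
        "cognitive load": split.fits_cognitive_load,
        "independent deployment": split.deploys_independently,
        "independent business value": split.delivers_value_independently,
    }
    return [name for name, ok in checks.items() if not ok]


def is_team_first(split):
    """A split qualifies only if it meets every criterion."""
    return not failed_criteria(split)


if __name__ == "__main__":
    # Hypothetical examples: one persona-based split, one technology-based split.
    by_persona = CandidateSplit("platform displays", True, True, True)
    by_technology = CandidateSplit("database layer", True, False, False)
    for split in (by_persona, by_technology):
        print(split.name, is_team_first(split), failed_criteria(split))
```

The point of the sketch is that the fracture plane itself is not what is assessed; the resulting situation for the team is.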

The importance of a team-first approach

As argued above, the autonomy of a team should be key in every decision, including the decision on how best to break up a monolith. And while this may seem obvious, in many organizations we see the opposite!

After all, teams are often assigned components. In many cases, the architecture has already been designed up front and it is left to the teams to implement it. This almost always leads to teams being responsible for only parts of this architecture, and hence they lose their ability to function as an autonomous unit from the start!

As the habit of assigning teams to architectures and components designed up front is so ingrained in organizations, it is often a challenge to turn such an existing way of working around. It is especially in these cases that a DevOps coach can add value by offering the team, and the organization, resources to realize such a change and by showing how this can be achieved in small steps.

Application of the theory in practice

NS, too, sometimes has to deal with a monolithic structure that makes a flexible and timely response to both changes and fixes cumbersome or altogether impossible. One example is the recent modernization (break-up) of two monolithic systems at the travel guidance department (reisbegeleiding).

The initial fracture plane used for the modernization was based on technology. The challenge was to find a more team-first-oriented approach that would allow development teams to deliver business value directly and more autonomously to their users from the first sprint of the modernization. In this case, the choice was made to start from the aforementioned user personas as opposed to the more traditional technology-based fracture plane. The central starting question was: what is minimally needed to display travel information to passengers on the platforms?

In addition, monolithic thinking had to be abandoned, in particular the assumption that all information that had hitherto come from the two existing monolithic systems would necessarily have to originate from these same systems in the new situation as well.

Instead of monolithic thinking, a couple of people started anew by thinking outside the box, i.e. completely detached from the existing IT landscape. They reconsidered the original question of what it would take to display travel information on the information board of a very small single-track station. Approached this way, the existing monolith turned out to disintegrate almost instantly.

Obviously, such a change entails more than a change of perspective in a technical sense. For example, in this particular case, user stories suddenly directly implied business value, whereas people were used to so-called technical user stories (i.e. without business value) to break up the monolith. It is therefore also up to product owners to ensure that all user stories contain a direct value for the business. Close collaboration with the Agile coaches should help to safeguard this boundary condition.

Conclusion

Each monolith requires its own unique (set of) fracture plane(s). A team-first approach should always be the starting point. The software architecture, surrounding processes, and organization should adapt in such a way that they optimally support the development team and enable it to operate autonomously. This way, the team will be able to deliver value to the business quickly, safely, and reliably, and in small increments, so that feedback loops are short and timely. The approach adopted by the travel guidance department shows how these principles have been applied successfully to break up an existing monolith and how, as a consequence, agility has been regained.