Modular Monoliths: Enable the Network Layer from the Start

--

TL;DR: To keep the modules of a modular monolith genuinely separable, activate a network transport at runtime and in production from the start, even if all the modules are deployed in the same process. This lets you detect and fix inefficient, overly chatty interactions between modules early, before they become a real problem when you eventually deploy the modules separately.

Original version in French

The Problem

Working in a modular monolith is exciting and very useful, but it requires great discipline. Preserving the possibility of separating modules at runtime is far from simple, and many pitfalls await in such an environment (multiple modules sharing the same database, for example).

Today, I want to focus on another common pitfall: having excessively chatty in-proc interactions between modules, or worse, adopting an RPC (Remote Procedure Call) style between them. These inefficient interactions compromise performance and can even call into question the module separation one might want to achieve later.

Burn the “RPC witch”!

For decades, we have known that the promise of RPC (Remote Procedure Call) — giving the illusion that code is calling a method on a local object when it is merely a proxy for an object located in another possibly distant process — leads us to a dead end.

This “leaky abstraction” often severely harms the performance of our applications. Without realising it, we can easily multiply network calls within for loops or adopt excessively chatty communications between modules, further degrading performance.

Instead of the RPC paradigm, let's make the implicit explicit: make it visible in our code that an exchange between modules is message passing that may involve I/O and the network.
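To make that concrete, here is a minimal Java sketch (every type and method name is invented for the example, none comes from a real codebase) contrasting a port that reads like a cheap local call with one whose explicit request/response messages and asynchronous return type keep the possible network hop visible at the call site:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// RPC-flavoured port: reads like a cheap local call and hides the possible network hop.
interface CustomerDirectoryRpcStyle {
    Customer findById(String customerId); // trivially called inside a for loop...
}

// Message-passing-flavoured port: an explicit request/response message and an
// asynchronous return type make the potential I/O visible at every call site.
interface CustomerDirectoryMessagingStyle {
    CompletableFuture<FindCustomersResponse> find(FindCustomersRequest request);
}

record Customer(String id, String name) {}
record FindCustomersRequest(List<String> customerIds) {}
record FindCustomersResponse(List<Customer> customers) {}
```

The second shape naturally nudges callers towards batching their requests instead of looping over findById.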

An Often Late and Frustrating Realisation

In the case of the modular monolith mentioned above, one might realise (a bit late) that the interactions between our modules are poorly suited to a network communication mode (HTTP or even gRPC). In other words, we will need to rethink them by:

  • Grouping/chunking our round trips using DTOs (which, as a reminder, exist to reduce the number of network round trips between two systems), as in the sketch after this list
  • Reviewing the contracts of the interactions between our modules so that they are less chatty (i.e. changing the ports, in a Ports and Adapters architecture), etc.
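Here is a small, deliberately naive Java sketch of that grouping (the BillingModule port and its DTOs are invented for the illustration): N round trips in a loop versus a single round trip carrying a batch DTO.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical port and DTOs, only there to illustrate the "group the round trips" idea.
interface BillingModule {
    Invoice invoiceFor(String orderId);           // chatty: one call per order
    List<Invoice> invoicesFor(OrderBatch batch);  // chunked: one call per batch
}

record Invoice(String orderId, double amount) {}
record OrderBatch(List<String> orderIds) {}

class MonthlyStatement {
    private final BillingModule billing;

    MonthlyStatement(BillingModule billing) { this.billing = billing; }

    // N round trips: harmless in-proc, painful once HTTP sits between the modules.
    List<Invoice> chatty(List<String> orderIds) {
        List<Invoice> invoices = new ArrayList<>();
        for (String orderId : orderIds) {
            invoices.add(billing.invoiceFor(orderId));
        }
        return invoices;
    }

    // A single round trip: the DTO carries the whole batch across the boundary.
    List<Invoice> chunked(List<String> orderIds) {
        return billing.invoicesFor(new OrderBatch(orderIds));
    }
}
```

In-proc, the two versions feel equivalent; with HTTP between the modules, only the second one remains acceptable.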

I say “a bit late” because it's precisely when we think, “it would make sense to finally deploy these two modules independently,” that we realise we won't be able to do it easily, nor without additional work and effort. 😭

A Pattern I Thought I Knew Well

For my part, I have been working here and there in modular monoliths for nearly 8 years, composing “hexagons” in the same process to frame the boundaries and prepare for these potential separations. It's a pattern I called the Hive, which a few of us have been experimenting with and improving for years (including my friend Julien Topçu, with whom I will present it very soon in a talk at KanDDDinsky in Berlin).

The hive pattern, coming soon to your screens 😉

Until now, I thought that implementing this pattern was sufficient to frame these boundaries between modules. We make ports (contracts) emerge between the modules to mark the boundaries, and at runtime we can use:

  • Either In-Proc Adapters (i.e., adapters that only make method calls in the same process, on the ports of other modules)
  • Or more classic adapters, which make HTTP calls (for instance) to talk to other modules when we want to deploy those modules in different processes (see the sketch below).
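Here is a minimal sketch of that duality, assuming a hypothetical CatalogPort owned by the calling module (the class names and the endpoint path are illustrative only):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// The port (contract) owned by the calling module.
interface CatalogPort {
    String productNameOf(String productId);
}

// In-proc adapter: a plain method call on the other module, in the same process.
class InProcCatalogAdapter implements CatalogPort {
    private final CatalogModule catalog; // the other hexagon, composed in the same process

    InProcCatalogAdapter(CatalogModule catalog) { this.catalog = catalog; }

    @Override
    public String productNameOf(String productId) {
        return catalog.nameOf(productId);
    }
}

// HTTP adapter: same contract, but the other module may live in another pod.
class HttpCatalogAdapter implements CatalogPort {
    private final HttpClient http = HttpClient.newHttpClient();
    private final String baseUrl; // e.g. the K8s service name of the catalog module

    HttpCatalogAdapter(String baseUrl) { this.baseUrl = baseUrl; }

    @Override
    public String productNameOf(String productId) {
        HttpRequest request = HttpRequest
                .newBuilder(URI.create(baseUrl + "/products/" + productId + "/name"))
                .GET()
                .build();
        try {
            return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (Exception e) {
            throw new RuntimeException("catalog module unreachable", e);
        }
    }
}

// The other module's public API, reduced to the strict minimum for this sketch.
class CatalogModule {
    String nameOf(String productId) { return "product-" + productId; }
}
```

The calling module only ever sees CatalogPort; which adapter gets wired in is a deployment decision, not a code change.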

Nice, the Force™

Well, I recently realised that something else had been protecting me until now against these problems of interactions unsuitable for the second deployment mode (i.e., when we split the modules and put network and I/O between them).

Until now, I had simply been…

Biased & Saved by My Previous Working Environments

In my case, it's rather that my use of modular monoliths has long been framed by the low-latency messaging solutions found in finance (e.g., 29West, later renamed Ultra Messaging). These allowed us to easily switch from one protocol to another without changing our code, simply by configuration. We could, for example, choose a transport layer among:

  • IPC (via shared memory and a ring buffer)
  • UDP unicast
  • UDP multicast (great for high throughput combined with low latency)
  • TCP (for infrastructures not compatible with UDP)

The consequence is that we knew — when coding — that each inter-module exchange would probably go through the network and all that it implies. We were therefore highly aware of this and explicitly did message-passing between our modules (and especially not RPC).

In other words, we made it very explicit in our code that we were talking to a module possibly deployed elsewhere (and therefore costly in terms of latency, subject to network-related uncertainties, etc.).

But That Was Before…

Today, I find that this is not necessarily the case for people who have only known the modular monolith through web API-based architectures or who even use gRPC (which unfortunately does not force people to do explicit message passing 😭).

So, to avoid additional delays and weeks spent reviewing and resizing some of the exchange protocols between our modules, my proposal is:

Activate a network transport layer at runtime (e.g., HTTP), EVEN when it is not necessary in production and you deploy all your modules in the same process/pod
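Concretely, and reusing the hypothetical CatalogPort and adapters from the earlier sketch, the composition root could keep both wirings but default to the network one (the FORCE_NETWORK_TRANSPORT variable and the catalog-module service name are assumptions made for this example):

```java
// Composition root: even when both modules are deployed in the same process,
// we deliberately wire the HTTP adapter so that every inter-module call pays
// (and therefore reveals) the cost of a real network hop.
class CompositionRoot {

    static CatalogPort catalogPort(CatalogModule inProcCatalog) {
        // Default to the network transport; opting out has to be explicit.
        boolean forceNetwork =
                !"false".equalsIgnoreCase(System.getenv("FORCE_NETWORK_TRANSPORT"));
        if (forceNetwork) {
            // Goes through the K8s service (and its load balancing), even if the
            // target pod happens to be... ourselves.
            return new HttpCatalogAdapter("http://catalog-module");
        }
        return new InProcCatalogAdapter(inProcCatalog);
    }
}
```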

And for example, for those of us who deploy all our services on K8s, this also means making sure:

  • That all our internal HTTP calls between modules of the same monolith go back through the load balancing (LB) layer (it distributes the load well, avoids many errors related to graceful shutdown, and also lets us benefit from throttling strategies at the infra level, which keeps us from self-DDoSing with calls that loop back to ourselves)
  • That we disable the automatic TCP keep-alive underlying our HTTP client implementations whenever we call other services or modules within the same K8s cluster. This is related to how Kubernetes works: it makes every call systematically go back through the LB and, above all, avoids holding on to illusory connections to pods that have already been preempted (elasticity and FinOps oblige). We avoid a good number of 503 errors and retries that way. Be careful, however, not to disable TCP keep-alive if you consume a module deployed in another region: in that specific case, you will really feel the cost of the systematic TCP handshake 😄 (see the sketch after this list)
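As a sketch of the second point, assuming Apache HttpClient 4.x (the choice of client is mine for the example; most HTTP clients expose an equivalent knob) and a hypothetical billing-module K8s service:

```java
import java.io.IOException;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.NoConnectionReuseStrategy;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

class IntraClusterHttp {

    // No connection reuse for calls that stay inside the cluster: every request opens
    // a fresh connection, so it gets re-balanced instead of sticking to a pod that may
    // already have been preempted.
    static CloseableHttpClient intraClusterClient() {
        return HttpClients.custom()
                .setConnectionReuseStrategy(NoConnectionReuseStrategy.INSTANCE)
                .build();
    }

    public static void main(String[] args) throws IOException {
        try (CloseableHttpClient http = intraClusterClient();
             // Target the K8s *service* name, not a pod IP, so the call goes back
             // through load balancing even when caller and callee share a pod.
             CloseableHttpResponse response =
                     http.execute(new HttpGet("http://billing-module/health"))) {
            System.out.println(response.getStatusLine());
        }
    }
}
```

Keep a separate, keep-alive-enabled client for the cross-region calls mentioned above, where the cost of a TCP handshake per request would really be felt.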

Okay, But What About Performance?!?

As usual, you just need to measure rather than imagine things (and that heuristic is always valid 🙂).

In our case, even when all the modules are deployed separately (but in the same K8s cluster), the impact of these I/Os remains very limited, even imperceptible to users. In any case, the fact that you have trained yourself to go through the network even while the modules were grouped in the same process will have served as a rehearsal for that upcoming separate deployment.
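If you want a first rough number before reaching for a real load-testing tool, a crude probe is enough to spot a chatty contract: time the same inter-module call through the in-proc adapter and through the HTTP adapter, then compare the percentiles (the helper below is a sketch, not part of the Hive pattern).

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Crude latency probe: run the same inter-module call N times and print percentiles.
class LatencyProbe {

    static void measure(String label, Supplier<?> interModuleCall, int iterations) {
        long[] micros = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            interModuleCall.get();            // e.g. () -> catalogPort.productNameOf("42")
            micros[i] = (System.nanoTime() - start) / 1_000;
        }
        Arrays.sort(micros);
        System.out.printf("%s: p50=%dµs p99=%dµs%n",
                label, micros[iterations / 2], micros[(int) (iterations * 0.99)]);
    }
}
```

Calling it once with the in-proc wiring and once with the HTTP wiring gives a tangible view of what the network hop actually costs for a given contract.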

A Necessary Feedback Loop

And if this is not the case and you notice latency issues on certain use cases, it is very likely that you should review your exchanges and the interface contracts (“ports” in the hexagonal architecture sense) between modules.

But introducing the network between your modules from the start, as if you wanted to deploy them separately right now, has a cardinal virtue: it provides early back-pressure against the bad designs, bad ideas, and bad interaction contracts between the modules of your modular monolith.

I don’t know about you, but as far as I’m concerned: getting feedback as quickly as possible on bad design choices is something that delights me 🙂

--

Written by Thomas Pierrain (use case driven)

Change Agent (powered by software). Symmathecist & VP of Engineering @Agicap. Organizer of the #DDDFR meetup. #lowLatency #XP #NFluent creator. @tpierrain.bsky.social
