In the beginning, was the monolith (part 1)

Vadym Barylo · Published in CodeX · Oct 4, 2021 · 6 min read

Often a monolithic application is built using an N-tier architecture, where N is the number of layers the architecture requires (usually 3, but it can be more). Our application consisted of 5 layers.

Security: this layer controls privileged access to the application and provides the corresponding abstractions to authenticate and authorize users. It can usually reuse parts of the presentation layer for its own management purposes, e.g. registering new users, restoring access, etc. We used the framework's pre-built security providers, so the login/logout process was implemented through the high-level API these providers offer. We also maintained several additional pages to manage user access.

Presentation: provides the user interface(s) to communicate with the application. We used server-side rendered pages and an MVC design pattern implementation, as it simplifies communication between the presentation and other layers.

Business: all business logic lives here, and all data processing is encapsulated in this layer. Under the hood, complex business entities usually consist of many data primitives, but the presentation layer is only allowed to access high-level domain abstractions.

Cache: an optional layer that can be declaratively integrated between the business and data layers to speed up data loading and reduce pressure on persistent storage.

Data: the layer closest to physical data storage, usually reflecting the storage data contracts. The business layer can access stored data through this layer, directly or indirectly through the cache layer, which abstracts it into more domain-friendly data structures.
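
To make the layering concrete, here is a minimal sketch in plain Java of how a request travels from presentation down to data, with the cache layer slotted in as a decorator between business and data. All class and method names are hypothetical, for illustration only:

```java
import java.util.HashMap;
import java.util.Map;

// Data layer: closest to storage, mirrors the storage contract.
interface OrderRepository {
    String findOrder(int id);
}

class SqlOrderRepository implements OrderRepository {
    @Override
    public String findOrder(int id) {
        // In the real application this would query the database.
        return "order-" + id;
    }
}

// Cache layer: a decorator between business and data,
// transparent to the callers above it.
class CachingOrderRepository implements OrderRepository {
    private final OrderRepository delegate;
    private final Map<Integer, String> cache = new HashMap<>();

    CachingOrderRepository(OrderRepository delegate) {
        this.delegate = delegate;
    }

    @Override
    public String findOrder(int id) {
        return cache.computeIfAbsent(id, delegate::findOrder);
    }
}

// Business layer: exposes high-level domain operations only.
class OrderService {
    private final OrderRepository repository;

    OrderService(OrderRepository repository) {
        this.repository = repository;
    }

    String describeOrder(int id) {
        return "Order summary: " + repository.findOrder(id);
    }
}

// Presentation layer (an MVC controller would sit here):
// it talks to the business layer, never to the repository directly.
public class LayersDemo {
    public static void main(String[] args) {
        OrderService service =
            new OrderService(new CachingOrderRepository(new SqlOrderRepository()));
        System.out.println(service.describeOrder(42));
    }
}
```

The key property: each layer only sees the interface of the layer directly below it, which is exactly what makes the "class instance to class instance" communication mentioned below so cheap.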

The benefit of such an architecture is that it is fast and easy to build. Many well-known frameworks provide good scaffolding mechanisms to build from scratch and evolve very fast. Also, communication between components and layers is as easy as "class instance to class instance" calls. I believe there is no better architecture than a monolith when "time" is your main concern. Only "buy, not build" is better.

But speed comes at a price. Having started with a big monolithic application, we realized that controlling the performance and responsiveness of particular parts is much harder when you deal with the whole. Degraded performance of one part shows up as degraded performance of the whole system. We experienced the noisy neighbor effect in its purest form. And most importantly, even when you know your bottleneck, there is no easy mechanism to control the resources a particular sub-system gets under the current load. Your lowest common denominator in this case is to forbid the action altogether, to avoid harming the whole system.

Our solution did not meet some of the most important quality attributes, such as high availability and fault tolerance, and this affected the user experience. So we decided it was high time to take our step in the macro-services direction, as an intermediate state on the way to full cloud-native (microservices) adoption.

Monolith to tightly coupled macro-services

The first stage: extract performance-critical business behaviors into independently deployable and independently manageable modules. The first goal we pursued: you don't need to worry about system performance as a whole if you can extract and tune the performance of specific parts. So the first reason for microservices adoption was micro-tuning.

As we didn’t have enough time and resources to re-design the solution to become fully modular — we identified only the most critical parts that are candidates to be extracted from the main process. But initially, we encapsulated into a separate module full domain bounded context that contains those behaviors that going to be extracted.

This solution does not differ significantly from the previous one. But "module thinking" allowed us to review the overall application from a domain perspective and define bounded contexts that can co-exist independently, each containing all the business actions required to properly execute business requests.

Also, a module is a much higher-level abstraction than layer, component, or service interfaces. It is self-sufficient from a business perspective, so it can be considered an additional and separate unit of "something". For example, if the system in general supports the "location transparency" characteristic, the module can be used outside of the virtualized environment.

Bigger or smaller?

But how big should this module be? At this stage we faced a dilemma: extract the module as a self-sufficient application (with its own UI), or as a business service that exposes a full set of APIs to solve business tasks.

We considered the first approach better from a long-term perspective, as it allows you to fully extract a part from the whole (your uptime SLA, in this case, is no longer a product of the SLAs of other parts), and it can be fully vertically owned by a separate team. But it is quite complex to implement. The main concerns here: security (the authentication and authorization layer needs to be extracted as a new reusable component) and smooth UX (there should be no feeling of application boundaries, so SSO and common layout parts like the header and footer need to be supported). So this approach required a lot of work on both sides (the whole application and the extracted module).

The second approach is easier and allows extracting the module as a separate web service with a set of business APIs to be consumed by the main application. It doesn't have to be publicly visible, so security constraints can be relaxed: as an implementation, it can be published inside the main VPC network and kept unavailable to the outside world. Additionally, the main application has to learn to communicate with external systems rather than simply calling a function on an instantiated component. Behind the scenes this also requires a service discovery and API gateway mechanism, but the market is full of frameworks and technology-agnostic solutions that reduce the cost of implementation, like Zookeeper, Nginx, and Registrator. So this approach was chosen as the short-term solution.
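
In practice, the change in the main application is mostly mechanical: an in-process method call becomes an HTTP call to the extracted service. Here is a minimal sketch using the JDK's built-in HttpClient; the service name, port, and path are illustrative assumptions, not our actual endpoints:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReportClient {
    // Hypothetical address, resolved inside the VPC (e.g. via service
    // discovery or internal DNS); not reachable from the outside world.
    private static final String BASE_URL = "http://reports.internal:8080";

    private final HttpClient client = HttpClient.newHttpClient();

    // Before: reportService.generate(orderId) on a local class instance.
    // After: the same business operation behind a remote API.
    public String generateReport(int orderId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(BASE_URL + "/api/reports/" + orderId))
            .GET()
            .build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException(
                "Report service returned " + response.statusCode());
        }
        return response.body();
    }
}
```

The calling code barely changes, but the failure modes do: the network can now fail, which is exactly where the life-cycle concerns described next come from.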

This solution was better than the previous one, as it supports independent life-cycle management for the parts, but it was far from what we expected, because it does not properly implement cloud characteristics and requires additional manual work to manage the service life cycle properly (like supporting health checks, etc.).
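
A health check is a good example of that manual work. Here is a minimal sketch using the JDK's built-in HttpServer; the port, path, and readiness logic are assumptions for illustration:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class HealthCheckServer {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);

        // The endpoint our CI agents (or any orchestrator) can poll to
        // decide whether to restart the service or provision a new one.
        server.createContext("/health", exchange -> {
            // A real service would verify its dependencies here,
            // e.g. database connectivity; this one always reports OK.
            byte[] body = "{\"status\":\"UP\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });

        server.start();
    }
}
```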

But the weak dependency we introduced already brought the first benefits: independent deployability and a limited noisy neighbor effect, since you now have separate processes that can also be hosted independently. Eventually, other business-critical pieces were migrated to separate services. The main application started behaving more like a mediator, analyzing user input and calling the appropriate services to process it. This solution was still hard to compare with a modern cloud microservices architecture: scaling, service recovery, and guaranteed availability were only partially automated. We needed to provision new services on increased load, or restart failed ones, by running specific CI agents. But we already had room for further improvements with little change to the applications themselves: adding a gateway in between and supporting scaling for the parts.

By the end of this refactoring, the database became our main bottleneck, as we had a single storage for all parts. So our performance concerns did not disappear; they moved to the next component layer, RDS. I will share our steps in mitigating that in the next post.

Lessons learned:

  • microservices thinking is good; starting with microservices is sometimes overengineering (extra cost for observability, monitoring, infrastructure orchestration, etc.)
  • the domain model, domain boundaries, and bounded contexts matter: you should organize your domain into a set of isolated contexts, each with its own business intent
  • even with a complex monolith, define the core modules the application can be split into, so that each bounded context can be managed individually
  • extract your module as a black box, exposing only the minimal API that is enough to execute the defined business rules over data in that domain context
  • a zero-trust network is better from a security perspective but more complex to implement; simply forbidding users access to some part of the system is a good enough short-term solution
