Monolith to Microservice migrations. Don’t start until you have read this …

Francis Lee
AI+ Enterprise Engineering
12 min read · Apr 20, 2021

TL;DR

By simply containerizing your existing applications in a lift/shift manner, you will gain tremendous benefits in reduced operational overhead; fundamentally, orchestrating and managing a myriad of servers individually is not an easy endeavour. Containerization provides a level of management abstraction over your infrastructure needs, making it simpler to manage and scale, and improving speed-to-market where it counts for the business. However, there are times when lift/shift isn't enough, and a refactoring of the existing application is required.

This is a long article on my approach to modernizing a legacy enterprise monolith into a microservice architecture. I tried to make it shorter, but then I realized that doing so would cut out the rationale that has to be considered when you make your design decisions in the migration endeavour. In this article, my approach may seem counter to what has been preached before, and YMMV; it only exemplifies that migrations from monolithic applications to microservices are complex. Hopefully, this article will pique your interest when engaging in such migrations.

Hybrid Monolith vs Total Re-Write

The buzzword of microservice architecture has been with us for a couple of years now. It fits nicely to a particular use case: one where you can cleanly divide up your solution into individual compartments, including the business logic and the associated data. Fundamentally, if you can design the solution architecture as different compartments of business logic, each with the data that is primary to that compartment's context, that's a prime candidate for the SOA approach, perhaps implemented using a microservice architecture.

The complexity arises when you have an existing monolithic application tied to a relational database with transactional processing between tables (common in most enterprise and e-commerce applications), and you're tasked to modernize this into a microservice architecture while still maintaining the same business flow/logic processing. In other words, the system users expect everything to be the same from a functionality/behaviour standpoint, but the engineering behind it is expected to change architecturally. As an analogy, it's like going from a gasoline-powered vehicle to a fully electric-powered one without any change to the UI and functionality; the UI and behaviour aspects are the same, but the engineering under the hood is a whole different animal.

If the monolithic application was designed with loosely coupled modules, and the database schemas are fairly independent for each of those modules, you have a fair chance of modernizing it to a microservice architecture. However, if your modules are tightly coupled (as in most of the enterprise applications I've encountered), and the database schemas have strong, cohesive dependencies on each other, it will be a mammoth task to break the application down into modular components and separate the data models into contextual boundaries. Such application designs usually have tables that are highly cohesive with one another, and the best practice of normalizing those tables makes decomposition even harder.
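To make that coupling concrete, here is a minimal sketch, assuming SQLAlchemy-style models and hypothetical tables (none of this comes from a real application): the ordering module's table is foreign-keyed into tables owned by other modules, so splitting the schema along module lines means breaking those normalized joins.

```python
# Hypothetical normalized schema showing cross-module coupling.
from sqlalchemy import Column, ForeignKey, Integer, Numeric, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Customer(Base):            # "customer management" module
    __tablename__ = "customers"
    id = Column(Integer, primary_key=True)
    name = Column(String(120), nullable=False)

class Product(Base):             # "catalogue" module
    __tablename__ = "products"
    id = Column(Integer, primary_key=True)
    sku = Column(String(40), unique=True, nullable=False)
    price = Column(Numeric(10, 2), nullable=False)

class Order(Base):               # "ordering" module, joined to both of the above
    __tablename__ = "orders"
    id = Column(Integer, primary_key=True)
    customer_id = Column(Integer, ForeignKey("customers.id"), nullable=False)
    product_id = Column(Integer, ForeignKey("products.id"), nullable=False)
    quantity = Column(Integer, nullable=False)
```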

If the task is to migrate such a monolithic application to a microservice architecture, I would strongly recommend a total re-write of the application instead of re-factoring bits and pieces of the monolith into microservice components in a hybrid way while keeping the service running operationally and functionally. Having a partial monolith dependent on clumps of microservices is a ticking time-bomb, and in the worst case, the microservices you end up with will resemble a Big Ball of Mud. Using the gasoline/electric car analogy above, it's like taking on the task of patching the gasoline car to become a hybrid (part gasoline, part electric). If the gasoline car was engineered in a component/modular fashion, there might be some possibility, but most well-engineered gasoline cars have tight couplings from the engine to the drive-train to the dashboard (operations), which makes patching them into hybrids an almost impossible task.

There is the added complexity of having to run the monolith's database in parallel with the microservices' own data stores, since part of the data model sits with the monolith and the other parts with the microservices. Ensuring that they all stay in sync requires a fair amount of engineering magic for the duration of the migration.

Best to start from a clean slate. While a re-write sounds like a lot of effort and counter-productive, in reality it is much faster, with fewer design and operational challenges, than juggling the re-factoring of bits and pieces of the monolith into microservices and crossing your fingers that the converted parts have taken all dependencies into consideration and will somehow integrate back into the monolith. Sometimes it is simply impossible to break parts off the monolith and maintain functionality; a re-write is the only sane choice.

Depending on the migration endeavour, sometimes a re-write is the only productive way forward. Having a hybrid solution (part monolith, part microservices) is a recipe for disaster as more effort is required in keeping both parts running.

From the Monolith to Microservices

As a first step to re-writing the application, we need to understand its functionality, behaviour and data schemas. Rather than looking at the existing application source code and its coupling and dependencies to understand the functionality, I would instead start by taking a holistic view of the application's data model. Usually, by looking at the data schemas and their relationships, I can get a good glimpse and overall understanding of the data requirements and a sense of the application's logic without wading through tons of source material. A lot can be gleaned just by understanding the data schemas used, and I usually reach this clarity much more quickly than by going through the code itself. On several occasions, this insight into the data schemas has later helped me find logic bugs in the implementation source code, introduced by change requests and patches from developers who did not have this lens into the data schemas.

Understanding the existing data-schemas provides clarity on the persistent data structures and models required for the application’s functionality.

Armed with clarity on the data schemas and relationships, I would walk through the application's UI and functionality together with the users to better understand the data inputs collected and the responses/reports expected to be generated. If the users come from different departments or have differing roles/functions, and thus different UIs, I would note such separations of functionality. If there are user stories, example mappings or behaviour-test cases used in the platform's design and testing, even better, as these form the accepted design criteria and use-case baseline. By correlating with the data schemas and structures gleaned in the step above, I can now map behaviour/functionality to the data schemas that support it. This lets me see whether there are provisions or gaps in the data model that support or affect the system's information entropy.

After this step, I have a data-oriented view of the application: the process flow of how data is transformed from the different user inputs, and the clusters of data schemas that are contextual to each other. By correlating these with the application's logical sections, guided by the SOLID design principles, a high-level, modular function/data map based on the data clustering can be formed. This mapping helps me visualise what functional modules are required and their associated data structures in a modular fashion. At this stage there will be modules that share common data structures, and these modules can be part of a bounded context (in DDD parlance). There will also be situations where modules are not part of the same bounded context (in your decomposition) but share some common data fields; these common data fields become the linking elements between separate bounded contexts. I usually do this step on a whiteboard with post-it notes, scrubbing functional requirements to data schemas to modules (SOLID-based) to achieve a modular decomposition of functionality and its related data schemas. There isn't an exact science to this approach, as this isn't truly reverse-engineering, but when properly executed you will have achieved the important step of modularizing the monolith.
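As an illustration of what comes off the whiteboard, here is a small sketch of such a functional-module/data map; the bounded contexts, modules, tables and linking fields are hypothetical examples, not taken from any particular application.

```python
# Hypothetical functional-module/data map: modules grouped into bounded contexts,
# each with the tables it owns and the shared fields that link contexts together.
module_data_map = {
    "ordering": {                                   # bounded context
        "modules": ["order_capture", "order_fulfilment"],
        "owns_tables": ["orders", "order_items"],
        "linking_fields": ["customer_id", "product_sku"],   # shared with other contexts
    },
    "customer_management": {
        "modules": ["registration", "profile"],
        "owns_tables": ["customers", "addresses"],
        "linking_fields": ["customer_id"],
    },
    "catalogue": {
        "modules": ["product_catalogue", "pricing"],
        "owns_tables": ["products", "prices"],
        "linking_fields": ["product_sku"],
    },
}
```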

Correlating the usage/functionality and the data process-flow with the data-schema maps, a high-level functional-module/data map can be formed. These modules and data can be clustered into their relevant bounded contexts to aid clarity in visual inspection, ensuring that information entropy is preserved while the functionality is decomposed into modules.

Designing the Microservices

The functional module/data map developed above separates and clusters the application into its relevant bounded contexts. The idea is that each bounded context can be powered by one or many microservices delivering the APIs that provide the business logic and functionality for that bounded context; each module is a candidate to become a microservice. This is more an art than a science (as mentioned before): we are trying to establish some form of componentization based on single-responsibility attributes per module, together with the information entropy of the data schemas once they have been separated appropriately. As mentioned above, determining the bounded contexts, which modules (or microservices) belong in each bounded context, how large a microservice should be (in terms of number of APIs), and so on, is more an art-form than an engineering science. The more experience you have in understanding and designing enterprise architecture, data architecture, organisational processes, etc., the better your microservice architecture design will fit the application.
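As a rough illustration of "one bounded context, one small API surface", here is a minimal sketch assuming Flask and an in-memory store standing in for the microservice's own database; the endpoints and fields are hypothetical, and the point is the shape (a narrow API owned by one context, with its own data), not a prescription.

```python
# Hypothetical "ordering" bounded context exposed as one small microservice.
from flask import Flask, jsonify, request

app = Flask(__name__)
orders = {}   # each microservice owns its own data store; a dict stands in here


@app.post("/orders")
def create_order():
    payload = request.get_json()
    order_id = str(len(orders) + 1)
    orders[order_id] = {
        "id": order_id,
        "customer_id": payload["customer_id"],
        "status": "ACCEPTED",
    }
    return jsonify(orders[order_id]), 201


@app.get("/orders/<order_id>")
def get_order(order_id):
    order = orders.get(order_id)
    if order is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(order), 200


if __name__ == "__main__":
    app.run(port=8080)
```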

While DDD provides a paradigm/tool to help organize the design patterns of a complex system, decomposing a monolithic application's functionality and services into a relevant microservice architecture is more art than science. Software architecture and development is a craft, and it requires experienced craftsmen for such endeavours.

Microservices are essentially Distributed Systems.

The Caveats when designing Microservices

When moving from a monolithic design to a microservice architecture, there are certain aspects of microservice design you need to be aware of. The following is a non-exhaustive list of these areas:

Data Duplication in Microservice Architectures: As you split the data schemas (as in the use case above) to fit the modules/microservices, there will be situations where it is necessary to repeat the same data fields in another module/microservice node. This is because each microservice should, by design, be self-contained with its own database, ensuring the isolation and loose coupling that microservices are known for. Having the same data fields repeated across separate microservices requires special handling, especially during changes and updates to those data fields, which affect every microservice that holds them. Several design patterns have been used to achieve data consistency across microservices, from CQRS and event sourcing to Saga patterns. Fundamentally, microservices are a form of distributed system, where each microservice is an isolated, contained system by itself. In such systems the laws of physics prevail, and we have the CAP theorem to contend with; the microservice architecture is one where the data stores have been partitioned but repeated data fields are available in different microservices. Essentially, we need to work with eventual-consistency behaviour in our application design if there are microservices that share data fields and we want some form of data consistency in our solution.
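As a sketch of that eventual-consistency behaviour (my own illustration, not the article's design), the snippet below has a customer service that owns the master record and publishes a change event, while an order service keeps a duplicated copy of the fields it needs and updates it when the event arrives; the in-memory bus stands in for a real message broker, which would deliver events asynchronously.

```python
# Hypothetical event-driven propagation of a duplicated data field.
from collections import defaultdict


class EventBus:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers[topic]:
            handler(event)   # a real broker would deliver this asynchronously


bus = EventBus()


class CustomerService:                     # owns the customer master record
    def __init__(self):
        self.customers = {"c1": {"id": "c1", "email": "old@example.com"}}

    def change_email(self, customer_id, email):
        self.customers[customer_id]["email"] = email
        bus.publish("customer.updated", {"id": customer_id, "email": email})


class OrderService:                        # keeps a duplicated, local copy
    def __init__(self):
        self.customer_cache = {"c1": {"email": "old@example.com"}}
        bus.subscribe("customer.updated", self.on_customer_updated)

    def on_customer_updated(self, event):
        self.customer_cache[event["id"]] = {"email": event["email"]}


customers, orders = CustomerService(), OrderService()
customers.change_email("c1", "new@example.com")
print(orders.customer_cache["c1"])   # eventually reflects the change
```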

Distributed Communications: As mentioned, a microservice architecture is essentially a distributed systems architecture, where each microservice is a self-contained node. There are known challenges when it comes to distributed systems. And since no microservice is an island (otherwise it might as well be a monolithic system), it needs to communicate with other microservices. As each microservice is a self-contained system, it needs its own communications stack to interface with other microservices. While there is no sanctioned standard for what this communication mechanism should be, it is safe to say that all the nodes need to implement the same one for the communications to work properly.

This is where it becomes interesting. If all the microservices are developed with the same software development stack/runtime, and that runtime provides communications mechanisms out of the box, leveraging this capability in the microservice nodes will be a relatively easy chore. However, by the microservice mantra, each microservice node can be developed by a different team with a different development stack/runtime, so a means to standardize this communication mechanism is required. For example, one development team may be using the Python stack/runtime while another uses the Java stack/runtime. To ensure the microservice nodes from both teams can communicate with each other, a standardised communications framework needs to be included, and every node will need to implement this ‘utility’ communications framework in its code. This is fine if all you have is a small number of microservices, but as more and more microservice nodes are developed (and with different development stacks/runtimes), implementing this communications framework becomes a toil on the developers (to use SRE terms).
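To illustrate the kind of ‘utility’ communications framework each team would otherwise re-implement in its own stack, here is a minimal sketch assuming Python and the requests library; the retry policy, header name and service hostname are hypothetical choices for illustration only.

```python
# Hypothetical shared service-to-service HTTP client: retries, timeouts, correlation IDs.
import uuid
from typing import Optional

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


class ServiceClient:
    def __init__(self, base_url: str, timeout: float = 2.0):
        self.base_url = base_url.rstrip("/")
        self.timeout = timeout
        self.session = requests.Session()
        retry = Retry(total=3, backoff_factor=0.2, status_forcelist=(502, 503, 504))
        self.session.mount("http://", HTTPAdapter(max_retries=retry))
        self.session.mount("https://", HTTPAdapter(max_retries=retry))

    def get(self, path: str, correlation_id: Optional[str] = None):
        headers = {"X-Correlation-ID": correlation_id or str(uuid.uuid4())}
        resp = self.session.get(f"{self.base_url}{path}",
                                headers=headers, timeout=self.timeout)
        resp.raise_for_status()
        return resp.json()


# orders_client = ServiceClient("http://orders.internal:8080")   # hypothetical host
# order = orders_client.get("/orders/42")
```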

Taking this into account, several vendors have developed mechanisms to remove this toil. This is the domain of the service mesh, often implemented as a sidecar to each microservice node. In typical implementations, the service mesh provides layer-5 (session-layer) capabilities in the 7-layer OSI network communications model. The sidecar acts as a proxy for the node to integrate into the service mesh; it provides a means for the microservice nodes to communicate in a service-to-service fashion, and while it is doing this, the service mesh takes on the additional responsibility of ensuring that the communications are secured and optimally managed.

Sidecar acts as the Proxy for each Microservice Node.

The question is, at which point do you need a service mesh in the microservice design for peer-node messaging? If all you have is a couple of microservices, then a service mesh might be overkill. I’ve worked on projects with up to 20 microservice nodes from different teams that still did not need a service mesh. If all project teams use the same software stack/runtime, having a service mesh might be moot and counter-productive. If there is a plan to increase the number of nodes and to have different development stacks across teams, it might be wise to invest the effort in implementing a service mesh.

Service-Mesh provides a consistent mechanism for microservice peer-to-peer communications, removing the toil of implementing the same communications utility in different software stacks/runtimes, thus simplifying development efforts. However, the Service-Mesh is another endeavour by itself, and it needs to warrant its own value in the scheme of needs.

Testing and Operational Complexity: Debugging and testing a distributed system is more complex than doing so for a single monolithic application host. There aren’t any IDE-style interactive debugging utilities that can span from one microservice system to another, looking into memory and variable state. As microservice nodes may span multiple physical systems, the mechanics of debugging microservices are similar to debugging any other distributed system: reviewing trace logs and correlating events. Investment in a standardized logging framework will facilitate this important endeavour, as it provides the necessary searching and correlation capabilities. Service-mesh products also provide logging capabilities, giving detailed events per microservice node when they are used. Debugging from such event logs isn’t intuitive and requires experienced craftsmen to diagnose issues. Tracing race conditions between microservices isn’t easy, and sometimes such events aren’t repeatable, making debugging even more complex.
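As a sketch of log correlation across services (assuming Flask, the stdlib logging module, and a hypothetical X-Correlation-ID header; none of this is prescribed by the article), each service reuses the caller’s correlation ID when logging, so trace logs from different nodes can later be joined on that ID.

```python
# Hypothetical correlation-ID logging inside one microservice node.
import logging
import uuid

from flask import Flask, g, jsonify, request

logging.basicConfig(format="%(asctime)s %(levelname)s corr=%(corr_id)s %(message)s")
log = logging.getLogger("orders-service")
log.setLevel(logging.INFO)

app = Flask(__name__)


@app.before_request
def attach_correlation_id():
    # Reuse the caller's correlation ID if present, otherwise start a new one.
    g.corr_id = request.headers.get("X-Correlation-ID", str(uuid.uuid4()))


@app.route("/orders/<order_id>")
def get_order(order_id):
    log.info("fetching order %s", order_id, extra={"corr_id": g.corr_id})
    # ... call downstream services, forwarding the same X-Correlation-ID ...
    return jsonify({"id": order_id, "status": "ACCEPTED"})
```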

Operations-wise, you will have a disparate network of microservice nodes whose health and performance need monitoring. While there are tools to assist in monitoring the health and performance of each node, the system’s overall performance and dependencies are another complex area to deal with. Suffice it to say that troubleshooting performance issues is at another level of complexity compared with simply profiling processes running on a single compute node.

Testing and operating distributed systems is more complex than doing so for a monolithic application running as processes on a single compute node. Even when you scale the monolithic application horizontally across multiple compute nodes, the business logic is still contained within each compute node’s boundaries.

Failure Management: Designing a microservice architecture isn’t simply decomposing the application’s functionality into integral components loosely coupled across a disparate network. Depending on your design, microservices may rely on other microservices for functionality, creating a dependency map of microservice nodes. A failure in any node will tend to have cascading effects on the overall system, and you need to design mitigating processes for when such situations happen. There are design patterns like Bulkhead and Circuit Breaker that help mitigate failures, and some service meshes (e.g. Istio) include them as value-add features on top of their communications capabilities.
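Below is a minimal circuit-breaker sketch, my own illustration rather than anything from the article or a specific library: after a number of consecutive failures, calls to the failing downstream service are short-circuited for a cool-down period, which limits the cascading effect described above.

```python
# Hypothetical circuit breaker guarding calls to a downstream microservice.
import time


class CircuitOpenError(Exception):
    pass


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise CircuitOpenError("downstream call short-circuited")
            self.opened_at = None            # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                    # success closes the circuit again
        return result


# breaker = CircuitBreaker()
# stock = breaker.call(inventory_client.get, "/stock/sku-123")   # hypothetical client
```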

Summary:

Migrating from a legacy monolithic application to a modern microservice architecture is not your easy walk-in-the-park variety of IT project, especially when the ask is to have the monolithic application and the microservices co-exist in parallel during the transformation. Coupled with strong cohesiveness in the database schemas, decomposing the monolith into bounded-context scopes is not easy; this article gives some insight into the challenges of such a migration. While microservices architecture is the shiny new toy, there is nothing wrong with a monolithic architecture if it was designed correctly (e.g. loosely coupled component modules); there is no dispute over the advantages that microservices architecture brings, but the pros and cons need to be rationalized and weighed before making the jump.
