How to rewrite legacy applications or split monoliths into microservices without slowing down feature delivery or introducing bugs into your system.
The need to rewrite all or part of an application arises frequently. Perhaps your organization plans to move from a monolith to microservices as part of a legacy modernization initiative, or parts of your application have become so complex that they slow down development and call for a rewrite.
But this kind of change in software architecture is a big undertaking:
- Rewrites are risky. There is a potential to introduce bugs into otherwise working software, which in some cases can mean revenue loss and customer churn for your organization.
- It does not add any value to the user. Continuously improving the maintainability of your code is important, but it is often a hard sell to management; feature requests tend to take priority over such improvements.
In the rest of this post, we’ll walk through an evolutionary approach to rewriting applications that navigates the challenges and constraints mentioned above. We’ll use a simplified eCommerce application as an example, but the principles apply to cases of varying size and complexity. The example has a flavor of microservices and RESTful JSON request-responses, but the approach also applies to other structures such as XML, libraries and objects, or GraphQL.
A rewrite here means that while the code becomes much more readable and maintainable, the inputs and outputs remain the same. Even if your eventual goal is to change the structure of the inputs and outputs, the suggestion is to go one step at a time: first improve the underlying code structure for the current set of inputs and outputs.
Step 0: The Initial State
The eCommerce application has three modules: inventory, pricing, and shipping. For valid reasons, you have identified that pricing needs to be rewritten, either in the same language or a different one. Let’s say the Pricing API has four fields: price, discount, tax, and total price for a product.
Step 1: Adding a new module/microservice as a proxy
The first step is to lay a foundation for the new module: it accepts pricing-related requests and forwards them to the legacy module, and does the same for the responses from the legacy pricing module. It will later grow into a full-fledged replacement of legacy pricing.
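At this stage the new module does nothing but pass requests through. A minimal sketch of that idea, with all names (`legacy_pricing_response`, `new_pricing_handler`, the field values) being illustrative stand-ins rather than a real codebase:

```python
def legacy_pricing_response(product_id: str) -> dict:
    """Stand-in for a call to the legacy pricing module.
    The values are made up for illustration."""
    return {"price": 100.0, "discount": 10.0, "tax": 17.1, "totalPrice": 107.1}

def new_pricing_handler(product_id: str) -> dict:
    """The new module starts life as a pure pass-through proxy:
    it forwards the request to legacy pricing and returns the
    legacy response unchanged."""
    return legacy_pricing_response(product_id)
```

Because the proxy returns the legacy response byte-for-byte, this step is safe to ship to production on its own.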
Step 2a: Partial rewrite (with errors)
You can now start implementing the logic in the new module. Let’s say you decided to rewrite the tax and total price calculation first.
The crux of this approach is a small sub-module in your new code, which we call the merging utility or the migration utility. The responsibilities of this utility are:
- Merging the responses from the legacy pricing and new pricing.
- In case of conflicts, defining the rules of resolution. In this case, the resolution would be falling back to the values from the legacy response as it is the source of truth.
- Reporting the conflicts. This can be as simple as logging them appropriately so that you can build visualizations in a logging/monitoring tool such as Kibana.
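The three responsibilities above can be sketched in a few lines. This is a hypothetical illustration, not the article’s actual utility; the logger name and dict-based responses are assumptions:

```python
import logging

logger = logging.getLogger("pricing-migration")

def merge_responses(legacy: dict, new: dict) -> dict:
    """Merge the legacy and new pricing responses.
    On conflict, fall back to the legacy value (the source of truth)
    and log the discrepancy for the monitoring dashboard."""
    merged = dict(legacy)  # start from the source of truth
    for field, new_value in new.items():
        if field not in legacy:
            merged[field] = new_value  # fields only the new module produces pass through
        elif legacy[field] != new_value:
            # Conflict: keep the legacy value, report the difference.
            logger.warning("Discrepancy in %r: legacy=%r new=%r",
                           field, legacy[field], new_value)
    return merged
```

The caller always receives the legacy values for conflicting fields, so a bug in the new code can never reach the user.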
As you rewrite the logic, you can apply all the best practices for clean code. Test-driven development is especially advisable for building a safety net of unit tests.
The trouble with unit tests:
Writing unit tests is a manual process, and you may understand the requirements incorrectly. Moreover, the people rewriting the code may not be aware of all the corner cases. That may lead to a wrong or incomplete implementation being unit tested, and subsequently to bugs in a feature that worked perfectly in the legacy code.
This is where the merging utility shines. It compares the responses from the new code against actual production data, gives you confidence that even in the worst case your application will behave as expected, and reports the list of problems you need to fix.
(In the following diagrams, 2915 represents the number of times a discrepancy occurred. Monitoring tools like Kibana can provide links to the details of each discrepancy.)
In this case, the discount was not subtracted from the total price, which led to an incorrect total price being calculated in the new module. The merging utility caught the error and used the correct values from the legacy module instead. It also logged the discrepancy, to be picked up by the logging and monitoring setup.
Step 2b: Fixing the errors
Once the issue shows up on the monitoring tool, you can troubleshoot and fix it, and update your unit tests to reflect the correct behavior. Once the fix reaches production, the discrepancy will naturally disappear from your monitoring dashboard.
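For the discount bug above, the fix plus its regression test might look like this. The function name, the 19% rate, and the flagged input values are all assumptions for illustration:

```python
def calculate_total_price(price: float, discount: float, tax_percentage: float) -> float:
    """Fixed logic: the discount is subtracted from the price
    *before* tax is applied. Rounded to 2 decimals for currency."""
    taxable = price - discount
    return round(taxable * (1 + tax_percentage / 100), 2)

def test_total_price_subtracts_discount():
    # Encodes the exact case the merging utility flagged in production,
    # so the bug cannot silently regress.
    assert calculate_total_price(100.0, 10.0, 19.0) == 107.1
```

Turning each reported discrepancy into a unit test like this closes the gap that manual test-writing left open.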
Step 3: Adding new features in the partially written new module
Before you could rewrite the whole pricing module, you received a requirement to add the tax percentage to the response.
This is really easy: just extend the new module to add the tax percentage, and it will be available in the final output. The beauty of this approach is that you do not need to make any changes to the legacy codebase, and the new module can deliver a new feature even though the rewrite is not yet complete.
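In a sketch (again with hypothetical names; the `taxPercentage` field name and 19% rate are assumptions), the new module just adds a field the legacy response never had, and a pass-through merge forwards it untouched:

```python
TAX_PERCENTAGE = 19.0  # assumed rate for illustration

def new_pricing_fields(price: float, discount: float) -> dict:
    """Fields computed by the new module, now including the new
    taxPercentage field. The legacy module stays untouched: a merging
    utility that passes through new-only fields delivers it to clients."""
    taxable = price - discount
    tax = round(taxable * TAX_PERCENTAGE / 100, 2)
    return {
        "tax": tax,
        "totalPrice": round(taxable + tax, 2),
        "taxPercentage": TAX_PERCENTAGE,  # the new feature
    }
```

Since `taxPercentage` does not exist in the legacy response, there is nothing to conflict with, and no merge rule needs changing.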
Step 4: Changing Existing Features through the partially written new module
Let’s say there is another requirement to implement: the tax needs to change from 19% to 16%. (This actually happened in Germany during the COVID-19 outbreak: the government reduced taxes from July 2020 to boost the economy.)
This requirement is a bit trickier, as a change in the tax percentage affects both the tax amount and the total price. This will lead to a divergence between the responses of legacy pricing and the new module.
You’ll need to tweak the merging utility by providing it a list of fields that it should always pick from the new module, regardless of whether they match the legacy response. In this case, the fields overridden by the new response are tax, tax percentage, and total price. Accordingly, the merging utility will not report these differences as discrepancies.
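One way to sketch that tweak (the field names and logger name remain illustrative assumptions): an override set tells the merge logic which divergences are intentional.

```python
import logging

logger = logging.getLogger("pricing-migration")

# Fields *expected* to diverge after the tax change: always taken
# from the new module and never reported as discrepancies.
OVERRIDE_FIELDS = {"tax", "taxPercentage", "totalPrice"}

def merge_with_overrides(legacy: dict, new: dict) -> dict:
    merged = dict(legacy)
    for field, new_value in new.items():
        if field in OVERRIDE_FIELDS or field not in legacy:
            merged[field] = new_value  # intentional divergence: no report
        elif legacy[field] != new_value:
            # Unexpected divergence: keep the legacy value and report it.
            logger.warning("Discrepancy in %r: legacy=%r new=%r",
                           field, legacy[field], new_value)
    return merged
```

Fields outside the override set still fall back to legacy and still get reported, so the safety net keeps working for everything you haven’t intentionally changed.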
Step 5: Rewriting the remaining parts
Now that you have delivered the critical features and have some slack time, you can return to rewriting the remaining parts of the new module. In this example, that means writing the logic to calculate the discount as 10% of the initial price. Assuming you implement the logic correctly (and don’t forget to write unit tests), the remaining discrepancies will disappear from your monitoring dashboard.
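That last piece is small; a sketch under the example’s stated rule (the function name is, as before, an assumption):

```python
def calculate_discount(price: float) -> float:
    """Discount rule from the example: 10% of the initial price,
    rounded to 2 decimals for currency."""
    return round(price * 0.10, 2)
```

With the discount logic in place, the new module computes every field of the Pricing API and the merge utility should report nothing.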
Step 6: Cleaning up and the final state
After a few days of observation, if your monitoring dashboard shows no new discrepancies, you can be fairly confident that the rewrite has been successful. You can remove the merging utility, cut the cords between the new pricing module and the legacy module, and clean up the legacy module by deleting the legacy pricing code.
Software rewrites often don’t add user value, and as such they may not be exciting to work on, but they should definitely not be a cause of anxiety. They may look daunting and hard to justify or execute, but the approach suggested here addresses both concerns: it can ensure a bug-free migration, and it leaves room to park the rewrite when a critical business requirement comes up and return to it when your team has bandwidth.
I came across this approach while reading the book Building Evolutionary Architectures. My team has rewritten codebases of varying size and complexity several times, without putting any business features on the back burner and without needing any special kind of testing. We were able to achieve zero-bug rewrites. Continuous migration, happy teams, happy businesses.