Rewriting legacy systems as legacy

Anand Raman
Kingfisher-Technology
3 min read · May 30, 2024

In my previous post, I made the case that the “fear” experienced by engineers is a better indicator of a legacy system than its “age”, its “tech stack”, or the “time” spent on its design and implementation. I even suggested that we may be creating legacy systems as we rewrite them using modern architectural guidelines and tech stacks.

Since then, some of you have asked me to share lived experiences. I intend to do just that. While I will not name names (to protect the innocent), this is a first-hand account, since it happened under my watch.

The (not) infinitely scalable booking engine

A few years ago, we were rewriting a complex booking engine for a digital transformation initiative. This booking engine was at the heart of an ecosystem of loosely coupled systems. It listened to events from the ecosystem and would make cargo bookings.

Figure: Simplified conceptual view of main components

The rewrite leveraged all the modern architectural principles one expects to see. It adopted an event-first, cloud-first, serverless-first approach. It was replacing a J2EE monolith that used JPA and a relational database. While the existing application supported all the complex business rules and was exceptionally stable, it suffered from two issues: it didn’t have a robust set of APIs, and it didn’t interact over events.

To fix these shortcomings, we decided to rewrite from scratch. We adopted all North Star architecture guidelines and designed extensible data and event models. We even adopted an event-first approach to communicate between the various sub-components within the system to cater to future volume projections. The final deployment architecture consisted of several serverless functions, nano services, topics, consumers, and producers. A relatively simple CRUD application was beefed up to resemble the Hulk.

While no effort was spared, this was also the beginning of persistent nagging issues. Often events would be left unprocessed, incorrectly processed, or ignored. Each morning, the engineering team would run a reconciliation process and retrigger events to correct entries.
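To make the morning routine concrete, here is a minimal sketch of what such a reconciliation step might look like. It is not the code we ran; the `Event` type, its fields, and the `find_events_to_retrigger` function are hypothetical names chosen for illustration, and it assumes every event carries a unique identifier that can be compared against the set of identifiers the booking engine recorded as processed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape; the real event model was richer.
    event_id: str
    booking_ref: str
    payload: str


def find_events_to_retrigger(received, processed_ids):
    """Return events that were received but never recorded as processed.

    `received` is an iterable of Event objects seen on the topics.
    `processed_ids` is a set of event ids the booking engine recorded.
    Duplicate deliveries of the same event id are reported only once.
    """
    seen = set()
    missing = []
    for event in received:
        if event.event_id in processed_ids or event.event_id in seen:
            continue
        seen.add(event.event_id)
        missing.append(event)
    return missing


if __name__ == "__main__":
    received = [
        Event("e1", "B-100", "create"),
        Event("e2", "B-101", "create"),
        Event("e2", "B-101", "create"),  # duplicate delivery
        Event("e3", "B-102", "amend"),
    ]
    for event in find_events_to_retrigger(received, {"e1"}):
        print("retrigger", event.event_id, event.booking_ref)
```

Note that a check like this only catches unprocessed or ignored events. Incorrectly processed events need a comparison of the resulting booking against the event content, which is exactly the kind of extra machinery that kept piling up.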

These production incidents didn’t change our behaviour. We didn’t take time to reflect or learn. Instead of tearing down the complexity, simplifying the architecture, and making it easier to process an event, we built a debugging aid (a web interface) to help engineers trace events and zero in on the root cause. This did not stop the steady stream of issues. Each release brought new challenges. The rewrite did not improve confidence; on the contrary, it shook the confidence of both engineering and management.

IMO, this is a textbook case of actively creating legacy code in the process of rewriting a system.

All is not lost. In my next instalment, I will share an example where the engineers started overcoming the fear factor one fix at a time and injected fresh life into a legacy system.

If you are interested in joining us on our journey, please check out our careers page.
