Tackling Legacy Code: How We Accelerate Code Migration at Agoda with Metaprogramming
As software engineers, we often strive to build scalable and efficient systems. Over time, however, the architectures we initially craft may become outdated due to technological evolutions or changing business requirements.
So, what do we do with code that has reached “legacy” status? At Agoda, rather than attempting a complete overhaul, we adopted a step-by-step, less disruptive approach: metaprogramming, using tools like JSCodeshift to modernize our codebase.
Meta What?
First, let’s discuss metaprogramming. Metaprogramming is a method of programming where we write code that manipulates other code. Some forms of metaprogramming inspect and alter a program while it runs; the form we rely on here, the codemod, reads source files and rewrites them ahead of time. We harness this power to update our legacy code without losing sight of the larger architectural picture.
Meet JSCodeshift
For this we use JSCodeshift, a toolkit developed by Facebook for running large-scale JavaScript code modifications. What sets JSCodeshift apart from text-based search and replace is that it parses our JavaScript code into an Abstract Syntax Tree (AST) and lets us traverse and modify that tree.
Abstract Syntax Trees (ASTs)
An Abstract Syntax Tree (AST) is a tree representation of the structure of the source code. Each node of the tree denotes a construct occurring in the source code. Navigating through this tree helps extract valuable information about the code structure, and more importantly, it assists in identifying and replacing code fragments systematically.
By using ASTs, JSCodeshift allows us to navigate through the structure of our code, going beyond simple regex string substitutions and offering deeper and more accurate transformations. As we navigate the AST, we can specifically pinpoint the legacy code and replace it with newer implementations.
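To make the idea concrete, here is a minimal, dependency-free sketch. It hand-builds an ESTree-style tree for the statement `legacyTrack("click")` and walks it to rename the call. The names `legacyTrack` and `track` are hypothetical, and a real codemod would let JSCodeshift parse the source and produce the tree rather than writing it out by hand.

```javascript
// ESTree-style AST for the source text: legacyTrack("click");
// Hand-written for illustration; a parser would normally produce it.
const ast = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "legacyTrack" },
        arguments: [{ type: "Literal", value: "click" }],
      },
    },
  ],
};

// Depth-first walk: call `visit` on every object that looks like a node.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visit));
  } else if (node && typeof node === "object") {
    if (typeof node.type === "string") visit(node);
    Object.values(node).forEach((child) => walk(child, visit));
  }
}

// Rename calls to `from` so they call `to`; returns the number of renames.
function renameCalls(root, from, to) {
  let renamed = 0;
  walk(root, (node) => {
    if (
      node.type === "CallExpression" &&
      node.callee.type === "Identifier" &&
      node.callee.name === from
    ) {
      node.callee.name = to;
      renamed += 1;
    }
  });
  return renamed;
}

const renamedCount = renameCalls(ast, "legacyTrack", "track");
```

Because the match is on a call expression whose callee is an identifier, a string literal or comment that merely contains the text `legacyTrack` is left untouched, which is where a regex substitution tends to go wrong.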
Fun Fact: Metaprogramming and ASTs in Your Daily Use
Here’s an interesting tidbit: as a JavaScript developer, you’ve likely used tools that rely on metaprogramming and ASTs without realizing it! These techniques are built into various popular tools and practices.
- Take Babel, for instance, the JavaScript compiler that transforms modern JS syntax into a backward-compatible version for current and older browsers or platforms. Babel parses your code into an AST, transforms the tree, and generates the compatible output from it.
- Similarly, ESLint, the tool that identifies and reports potential code errors, analyzes the AST of your code to enforce coding conventions and catch likely bugs.
- Prettier, a code formatter, parses your code into an AST and reprints it according to its own rules, taking line length into account to deliver a consistent style.
- Lastly, the code-editing features of your Integrated Development Environment (IDE) very likely rely on ASTs too. IDEs like Visual Studio Code and JetBrains’ WebStorm use ASTs under the hood to understand the hierarchical structure of your code and provide features like code navigation, refactoring, and intelligent code completion.
So, while coding in JavaScript, you have been tapping into the power of metaprogramming and ASTs more often than you might have imagined! As you can see, these technologies are not exclusive to large-scale migrations or complex system development; they are woven into our everyday coding life.
Deep-Diving into Agoda’s Problem Context
At Agoda, our FE-DATA team has developed an analytics framework for web and app platforms. This framework communicates with our internal system, the Generic Collector, through three types of messages. These messages are then sorted and stored in three different internal storage systems. For this post, we’ll focus on the ‘Generic’ message type, which we use to track user interaction events.
In the initial stages of our analytics framework, we developed two JavaScript libraries: @agoda/messaging-client (mcjs) and @agoda/analytics-data-acquisition. These libraries were designed to send messages from the web to the Generic Collector and to provide an API for our developers to easily track user interaction events, respectively. However, we soon realized that our approach had some limitations.
One of the main challenges we faced was the lack of a proper catalogue of tracked events. Event context fields sometimes contained unexpected values, and some sets of event context fields had to be set manually for every individual event. We introduced two new concepts to address these issues: Context Separation and Event Definition.
Context Separation allowed us to categorize event context fields into three levels: App, Page, and Event. This meant developers had to set up their app context and page context before calling the track function, ensuring that all events contained the necessary context fields.
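As a rough illustration of Context Separation, the sketch below models the three levels with a small tracker object. It is not the actual API of our libraries: `createTracker`, `setAppContext`, `setPageContext`, the `send` callback, and all field names are hypothetical.

```javascript
// Hypothetical tracker illustrating three context levels: App, Page, Event.
function createTracker(send) {
  let appContext = null;
  let pageContext = null;

  return {
    setAppContext(context) {
      appContext = { ...context };
    },
    setPageContext(context) {
      pageContext = { ...context };
    },
    track(eventName, eventContext = {}) {
      // Refuse to send until both outer levels are in place.
      if (appContext === null || pageContext === null) {
        throw new Error("Set app and page context before calling track()");
      }
      send({
        name: eventName,
        context: {
          app: appContext,
          page: pageContext,
          event: { ...eventContext },
        },
      });
    },
  };
}

// Usage: collect messages in an array instead of sending them anywhere.
const sent = [];
const tracker = createTracker((message) => sent.push(message));
tracker.setAppContext({ appName: "web" });
tracker.setPageContext({ pageName: "search" });
tracker.track("button.click", { buttonId: "book-now" });
```

With this shape, an individual `track` call only carries the fields specific to that event, while the app and page fields are attached automatically.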
Event Definition, in turn, required developers to create an event definition before sending any events. This definition specifies the event’s owner, the page and application it is sent from, the fields included as event context, and the expected data type of each event context field.
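An event definition along these lines could look like the sketch below. The object shape, the field names, and the `validateEventContext` helper are assumptions made for illustration, not the real format used by our framework.

```javascript
// Hypothetical event definition: owner, origin, and typed context fields.
const searchButtonClick = {
  name: "search.button.click",
  owner: "fe-data",
  application: "web",
  page: "search",
  fields: { buttonId: "string", position: "number" },
};

// Return a list of problems; an empty list means the context matches.
function validateEventContext(definition, eventContext) {
  const problems = [];
  for (const [field, expectedType] of Object.entries(definition.fields)) {
    if (!(field in eventContext)) {
      problems.push(`missing field: ${field}`);
    } else if (typeof eventContext[field] !== expectedType) {
      problems.push(`wrong type for ${field}: expected ${expectedType}`);
    }
  }
  for (const field of Object.keys(eventContext)) {
    if (!(field in definition.fields)) {
      problems.push(`unexpected field: ${field}`);
    }
  }
  return problems;
}

// A matching context, and one with a wrong type, a missing field, and an extra field.
const okResult = validateEventContext(searchButtonClick, {
  buttonId: "book-now",
  position: 2,
});
const badResult = validateEventContext(searchButtonClick, {
  buttonId: 42,
  extra: true,
});
```

Checking events against a declared definition is what turns the definitions into a catalogue: unexpected values and undeclared fields surface as explicit problems instead of silently reaching storage.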
However, migrating over 1000 events and creating event definitions for each one was a daunting task. During this period of manual migration, we discovered a pattern that allowed us to streamline the process using metaprogramming.
Building the Codemod CLI
Ironically, the two new concepts we introduced, while improving our code quality, also created a new problem. Every event now needs an associated definition, and that definition must specify which page and application the event comes from. Working out the application is easy, but the page is a different story: it is often hard to determine the page an event is sent from simply by looking at the code.
Because the page cannot always be inferred from the code alone, the migration occasionally needs a developer’s input, and we built the Codemod CLI with that in mind. Rather than rewriting such code immediately, the tool inserts a comment above each identified call site, asking the developer to supply the missing information. Once the developer has filled it in, another run of the CLI completes the migration.
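The annotate-then-finalize flow can be sketched as follows. This is a simplified model under stated assumptions, not the actual CLI: call sites are plain objects rather than a parsed AST, and the comment marker `codemod: page=`, the function names `legacyTrack` and `track`, and the `page` property are all hypothetical.

```javascript
const MARKER = "codemod: page=";

// Simplified stand-in for a call node with its attached comments.
function makeCall(calleeName) {
  return {
    type: "CallExpression",
    callee: { type: "Identifier", name: calleeName },
    comments: [],
  };
}

// One CLI run over a list of call nodes.
function runCodemod(calls) {
  const summary = { annotated: 0, migrated: 0, waiting: 0 };
  for (const call of calls) {
    if (call.callee.name !== "legacyTrack") continue;

    const index = call.comments.findIndex((c) => c.startsWith(MARKER));
    if (index === -1) {
      // First encounter: leave the code alone and ask for input.
      call.comments.push(`${MARKER}?`);
      summary.annotated += 1;
      continue;
    }

    const page = call.comments[index].slice(MARKER.length).trim();
    if (page === "?" || page === "") {
      summary.waiting += 1; // the developer has not answered yet
      continue;
    }

    // Answer present: rewrite the call and remove the prompt.
    // Storing `page` on the node is a stand-in for the real rewrite.
    call.callee.name = "track";
    call.page = page;
    call.comments.splice(index, 1);
    summary.migrated += 1;
  }
  return summary;
}

const calls = [makeCall("legacyTrack"), makeCall("legacyTrack"), makeCall("otherFn")];
const firstRun = runCodemod(calls); // annotates both legacy calls
calls[0].comments[0] = `${MARKER}search`; // a developer answers one prompt
const secondRun = runCodemod(calls); // migrates the answered call only
```

Keeping the prompt in the source file means the tool needs no separate state between runs: whatever is still annotated is still waiting for an answer.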
Real-Time Visibility and Monitoring
Our dashboard tracks how much legacy code remains. It’s powered by a scheduled job that clones the repository, runs the CLI, aggregates the data, and pushes it into a Hadoop table. This visibility makes hunting down and eliminating outdated code more manageable.
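The aggregation step of such a job might look like the sketch below. The record shape, the file paths, the owner names, and the idea of grouping by owner are assumptions for illustration; cloning the repository and writing to Hadoop are left out.

```javascript
// Hypothetical CLI output: one record per legacy call site still present.
const report = [
  { file: "src/search/results.js", owner: "search" },
  { file: "src/search/filters.js", owner: "search" },
  { file: "src/booking/form.js", owner: "booking" },
];

// Count remaining legacy call sites per owner, largest count first.
function countRemainingByOwner(records) {
  const counts = new Map();
  for (const { owner } of records) {
    counts.set(owner, (counts.get(owner) || 0) + 1);
  }
  return [...counts]
    .map(([owner, remaining]) => ({ owner, remaining }))
    .sort((a, b) => b.remaining - a.remaining);
}

const rows = countRemainingByOwner(report);
```

Each row is then something a dashboard can chart over time, one data point per scheduled run.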
Preventing the Spread of Legacy Code
Recognizing that migrating legacy code takes time, we’ve implemented a custom ESLint rule to curb the spread of legacy code in our codebase. The rule detects when developers reach for the legacy approach and points them to the recommended alternative.
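A rule of this kind can be sketched with ESLint’s standard rule shape: a `meta` object plus a `create` function that returns AST visitors. The rule below is an illustration, not our actual rule; the function names `legacyTrack` and `track` and the message text are hypothetical.

```javascript
// Sketch of a custom ESLint rule; `legacyTrack` and `track` are hypothetical names.
const noLegacyTrack = {
  meta: {
    type: "problem",
    docs: { description: "disallow new calls to the legacy tracking function" },
    messages: {
      noLegacy: "'{{name}}' is legacy. Use 'track' with an event definition instead.",
    },
    schema: [],
  },
  create(context) {
    return {
      // ESLint calls this visitor for every call expression in the file.
      CallExpression(node) {
        if (
          node.callee.type === "Identifier" &&
          node.callee.name === "legacyTrack"
        ) {
          context.report({
            node,
            messageId: "noLegacy",
            data: { name: node.callee.name },
          });
        }
      },
    };
  },
};

// Exercise the rule with a stub context instead of a full ESLint run.
const reports = [];
const visitor = noLegacyTrack.create({
  report: (descriptor) => reports.push(descriptor),
});
visitor.CallExpression({
  type: "CallExpression",
  callee: { type: "Identifier", name: "legacyTrack" },
});
visitor.CallExpression({
  type: "CallExpression",
  callee: { type: "Identifier", name: "track" },
});
```

In a real setup the rule object would be exported from a plugin and enabled in the ESLint configuration, so the warning appears in the editor and in CI at the moment someone adds a new legacy call.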
The Results — Speed and Accuracy
Our approach has made our migration process 3.5 times faster than manual methods. Manual migration also comes with a learning curve, and climbing it often leads to expensive errors, which makes the automated approach potentially more efficient still. By automating most updates, our CLI tool removes much of that learning process and its related errors, enabling developers to hit the ground running.
Conclusion
Innovative tools such as Codemod CLI and JSCodeshift, coupled with strategies like metaprogramming and custom ESLint rules, have changed our approach to code migration at Agoda.
However, these advancements merely scratch the surface of metaprogramming’s potential, with possibilities of more efficient code generation tools and the prospect of a ‘no-code’ platform on the horizon.
Empowering product owners and designers to directly implement small yet powerful changes to our product without the need for a developer at every iteration would expedite the design process and foster a seamless, participative approach to product development.
We remain committed to exploring and refining these tools and strategies, streamlining our codebase, and revolutionizing the software development process.