
How to regain control in a legacy codebase

Adrien Joly
13 min read · Jul 22, 2022


This article is a translation by Josian Chevalier of “Comment rendre un code legacy à nouveau maintenable”.

Have you ever avoided modifying a function for fear of causing bugs? Gotten upset trying to understand how the source code of the project you just joined works? Broken into a cold sweat while approving the few code changes submitted by a colleague in their Pull Request?

If so, you have dealt with what we call legacy code. We use the term legacy because it generally is code that we inherited from another developer. The lack of documentation and automated tests makes developers afraid of breaking it, possibly without even realizing it, whenever they modify it.

Fortunately, legacy code is not inevitable. Together, we will look at a few software craft techniques to make this code maintainable again without starting over, so you can regain confidence and enjoyment in maintaining it.

I have been a web developer since 2006. I am currently helping a startup restructure the source code of a data processing application. The team had to take a few shortcuts and now loses hours of work to recurring bugs. On the side, I maintain Openwhyd, a music sharing application I developed within a startup between 2012 and 2015 and open-sourced in 2016. Most of the examples given in this article come from one of these two experiences.

Legacy code is not inevitable

Too often, during discussions among developers, legacy code is mentioned as a source of disgust and disdain. Working on legacy code is seen as a punishment. The authors of such code are judged on their choice of now obsolete technologies, the lack of clarity and concision in the code, or the lack of automated tests.

“If a legacy codebase is still maintained, it means it is valuable”

Contributing to the development of a legacy codebase is, at first glance, less tempting than starting a brand new project based on the latest popular frameworks. But which one of these two really has the most impact? If a legacy codebase is still maintained, it means it is valuable. If it did not add value to its owner, they would have gotten rid of it.

The first time I was confronted with legacy code was during an internship. My mission was to extract a feature from a software application into a dynamic library, so it could be replaced by another. The main challenge was to achieve this without any documentation or any contact with the authors of the source code. This experience taught me to define a strategy based on formulating hypotheses, then verifying them, methodically, one by one. At first I found this work quite difficult and off-putting compared to the development of a brand new software project. Eventually, the success of this mission brought me unexpected satisfaction and improved my skills beyond expectations!

We are all guilty of having written legacy code!

Do you think writing legacy code is a decision? One thing is sure: we all wrote some, because none of us is perfect. A SQL expert does not necessarily know how to write automated tests. A tester does not necessarily know how to name their variables. A startup CTO does not necessarily know how to get their whole team to agree on a common stack and tooling. We are all guilty of writing legacy code, so we are in no position to judge the code of others.

First contribution received on my open source project: a linter configuration!

When returning to code I had written a few years earlier, I suffered from its lack of clarity, the lack of automated tests, and the lack of maturity I had shown when authoring features that became unmaintainable. By open-sourcing it, I understood that maintaining my own code was not enough. Potential contributors have near-zero comprehension of the product and its design, and neither they nor I want to spend hours discussing before integrating a bug fix. Case in point: the first Pull Request I received was neither a bug fix nor a feature. It was the addition of a linter configuration and a Makefile! A sign that defining explicit norms and automating simple tasks are prerequisites to collaborating on a codebase.

1. Installation, documentation and aligning the methodology with the team

Drawing from this experience, it was my turn to take on the role of a “contributor”. The Signaux Faibles team was looking for this type of profile and skills, so I joined the project. I started by checking what was already in place: multiple git repositories, some documentation, and a few automated tests.

To better understand their way of developing and their coding conventions, we organized recurring pair programming sessions. During these sessions, we repaired automated tests, explored important parts of the source code, and I got familiar with their technology stack. Each session ended with the submission of a pull request that we reviewed together, and a retrospective to share our impressions of the session and propose improvements for the next ones. We did our best to keep written traces of the choices made, before or during these sessions, so they could act as documentation.

“It is best to start working on a scope of limited size and risks”

It is best to start working on a scope of limited size and risk: a component rather isolated from the rest of the application, to avoid side effects as much as possible. Once these components are more solid, it will be easier to rework the rest, using this first effort as a template for the next ones, which will most likely be more complex.

Example of commits done in pair programming and TDD

So we decided to start by writing a script to automate a procedure that was taking a lot of time and attention. Since this script has no direct impact on the rest of the code, mistakes carry less risk. Its limited scope made this project achievable in a short time, which was ideal for our first steps in pair programming, carried out in TDD. Next, we would focus on redesigning a set of JavaScript files used in a map-reduce operation performed by the MongoDB data server.
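To give an idea of what that second step involves: a common way to make map-reduce code testable is to extract its logic into pure functions that can run outside the database. A minimal sketch, where `Company` and `flag` are hypothetical names rather than code from the actual repository:

```typescript
// Hypothetical sketch: the business logic of a MongoDB `map` function is
// extracted into a pure function so it can be unit-tested without a database.
// `Company` and `flag` are illustrative names, not from the real codebase.
type Company = { siren: string; late_payments: number };

// A pure function: deterministic, with no dependency on the MongoDB runtime
function flag(company: Company): { key: string; value: boolean } {
  return { key: company.siren, value: company.late_payments > 2 };
}

const result = flag({ siren: "123456789", late_payments: 3 });
console.log(result.value); // true: more than 2 late payments
```

The map function registered in MongoDB then becomes a thin wrapper around `flag`, and all the interesting logic can be covered by plain unit tests.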

As I discussed the use of their software with the team, I took notes on what I understood about the domain, the usage and operation of each repository, the common procedures… And I systematically asked for validation of these notes, whether to integrate them into the documentation repository of our project, to add comments to the source code, or to build a glossary for myself.

For example, I like to take advantage of my fresh eyes on a codebase by installing it on my machine following the instructions in its documentation, then offering to complete them if necessary. At Signaux Faibles, I went one step further and wrote an installation procedure in which each underlying service is launched in a Docker container. This makes the installation more automated and portable. It also, incidentally, avoids installing several versions of each service on my computer.

Installation procedures written during my onboarding

Following these first iterations, I had a better vision of the project and of the team’s way of working, and I began to see opportunities for improvement.

2. Securing your development through tests and monitoring tools

At this point, I understood the stakes for the project and the team, at least on the technical side. I had an overview of the existing source code and its documentation, and a map of the areas that would benefit most from being reworked first. This does not mean that we were ready to start the redesign!

Testing in Continuous Integration

Knowing that a redesign can fail or cause anomalies that can be costly to people who rely on the proper functioning of the software, it is important to take precautions. To ensure that the software keeps working as expected, it must be tested throughout the redesign. To do this, we need to write tests of various granularity.

  • unit tests for each component;
  • integration tests with all the related components;
  • functional tests (or end-to-end) covering the operation of the system as a whole.
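At the base of this pyramid, a unit test boils down to a few assertions on one function. A minimal sketch in TypeScript, where the validation rule is purely illustrative:

```typescript
import { strictEqual } from "node:assert";

// Hypothetical unit test: the function and its rule are illustrative,
// not taken from the Signaux Faibles codebase.
function isValidSiren(siren: string): boolean {
  return /^\d{9}$/.test(siren); // a SIREN is exactly nine digits
}

// Each assertion documents one expected behavior of the unit under test
strictEqual(isValidSiren("123456789"), true);
strictEqual(isValidSiren("12345"), false);
console.log("unit tests passed");
```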

“To reduce the risk of oversights, these tests must be able to run automatically and on a regular basis”

To reduce the risk of oversights, these tests must be able to run automatically and on a regular basis. The most common way to do this is to set up a Continuous Integration (CI) pipeline that will execute these tests systematically, each time code changes are submitted by the team in the project repository.

Since the source code of Signaux Faibles is hosted in public repositories on GitHub, we can benefit from various CI solutions for free. I recommended GitHub Actions for its integration into GitHub (a tool the team already used), the simplicity of setting it up, and its features, which more than cover our needs. We configured it to run the tests every time a Pull Request is submitted and every time a commit is added to one.

Execution of our tests in Continuous Integration

This allows us to iterate confidently on our development tasks, relying on our passing tests as a safety net. If the tests don’t pass, the code we proposed does not work as it should and has to be modified again. We only merge modifications once the tests are green and another teammate has read and approved them. Peer review is important to catch errors that may have slipped through the cracks of our tests, but also to improve and align our coding practices within the team.

Test Coverage Monitoring

How can we reduce the risk of bugs slipping through the cracks of our automated tests? It is very difficult to give a complete and foolproof answer to this question, but we can at least monitor which parts of the source code are not (yet) covered by tests. The important thing is to ensure that the coverage does not drop when integrating changes into the source code.

“The important thing is to ensure that the coverage does not drop when integrating changes into the source code”

Coverage is measured by the test runner. For example, in the JavaScript/TypeScript ecosystem, Istanbul is commonly used. At the end of the test execution, the tool generates a file reporting the percentage of code that was executed by the tests, in each file of the source code.

In order to monitor the evolution of this coverage, we can generate this report each time the tests are run in the CI environment (and therefore, each time a modification is proposed in the source code), before forwarding it to a service that will notify us if the coverage drops.
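Such a coverage gate can also be approximated with a small script run in CI. A sketch, assuming Istanbul’s `json-summary` reporter; the embedded JSON stands in for the real `coverage/coverage-summary.json` file:

```typescript
// Sketch of a coverage gate: parse Istanbul's json-summary output and fail
// the CI job if line coverage falls below a chosen baseline. The embedded
// JSON stands in for the real coverage/coverage-summary.json file.
const summaryJson = `{"total": {"lines": {"total": 200, "covered": 175, "pct": 87.5}}}`;

const summary = JSON.parse(summaryJson);
const baseline = 85; // last known coverage; tighten it as coverage grows

const pct: number = summary.total.lines.pct;
if (pct < baseline) {
  throw new Error(`Coverage dropped: ${pct}% < ${baseline}%`);
}
console.log(`Coverage OK: ${pct}% >= ${baseline}%`);
```

Ratcheting the baseline up as coverage improves prevents regressions without demanding 100% coverage overnight.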

We are currently trying the Codacy service to this end.

I want to clarify that reaching 100% coverage is in no way a guarantee that your system is free of anomalies, even if your tests are flawless. It is simply impossible to simulate all the combinations of cases that your system may have to handle at some point.
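A tiny example makes this concrete: the function below reaches 100% line coverage with a single test, yet still misbehaves on an input that test never tries.

```typescript
// Illustration: fully covered, yet still buggy. A single test executes
// every line of `divide`, so coverage reports 100%...
function divide(a: number, b: number): number {
  return a / b; // bug: no guard against b === 0
}

console.log(divide(10, 2)); // 5 -- the only case the "test" exercises
// ...but divide(1, 0) silently returns Infinity instead of failing loudly.
```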

Quality monitoring

How can the team measure the improvements brought by the redesign? It is possible to monitor some indicators: for example, the evolution of the number of features delivered per week, the number of bugs encountered per week, the time spent debugging, or the level of satisfaction expressed by the developers.

Beyond these indicators, it is also possible to follow metrics based on analysis of the source code itself: cyclomatic complexity, freshness of dependencies, compliance with good design practices, and so on.

The Codacy service allows us to visualize our progress: here, the drop in the number of anti-patterns.

But above all, it is relatively easy to measure, and in some cases automatically correct, compliance with the coding conventions chosen by the team, by using a linter and defining a configuration for it. For example, the linter will check that code files comply with the team’s preferences in terms of indentation (tabs or spaces? how many?), usage of semicolons after each statement, the casing of variable and function names, and so on.
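As an illustration, here is what a few such conventions look like expressed as ESLint-style rule entries. The rule names (`indent`, `semi`, `camelcase`) are real ESLint core rules; the chosen values are made up for the example, not the Signaux Faibles configuration:

```typescript
// A few team conventions expressed as ESLint-style rule entries.
// The rule names are real ESLint core rules; the values are illustrative.
const lintRules = {
  indent: ["error", 2],        // indent with 2 spaces, not tabs
  semi: ["error", "always"],   // require a semicolon after each statement
  camelcase: ["error"],        // enforce camelCase identifiers
};

console.log(Object.keys(lintRules).length); // 3 rules configured
```

Once such rules are written down, disagreements about style move from code reviews into a one-time configuration discussion.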

Some programming languages (e.g. Go) have their own conventions and their own linters. Others, like JavaScript and TypeScript, are more agnostic, so it is up to the team to define the rules to comply with. I suggested and implemented ESLint to enforce these rules in Signaux Faibles’ JavaScript and TypeScript files.

Like test coverage, these metrics can be tracked and monitored by third-party tools, run from within our CI environment. We also use Codacy to this end.

3. Redesign Iterations

As I explained above, it is safer to redesign the source code gradually, starting with the components with the least risk of impacting the most critical components of the system.

“Beyond the risk, we must also consider the impact of improving the components”

Beyond the risk, though, it is important to consider the impact of improving each candidate component. Ideally, we would start by redesigning a component that is relatively isolated from the others but whose usage and/or maintenance is currently very costly for the stakeholders.

The nature of this impact will help decide which redesign strategy to follow.

Tests as functional gatekeepers

Regardless of the strategy, it is essential to have at least a few automated tests, to make sure that our modifications will not cause functional anomalies throughout our redesign.

Ideally, you should have:

  1. end-to-end tests for the features of the system that could be impacted by the redesign;
  2. unit tests for the functions you (re)write;
  3. and, if necessary, integration tests using the components with which the redesigned component interacts.

At Signaux Faibles, the team had already produced a few unit and integration tests that compared the data resulting from the processing (through functions) with a reference result. This is called a golden master (a.k.a. “approval tests”). Despite its lack of explicitness and precision regarding the expectations of each test, this method is a good start for assessing the potential impact of our modifications on the operation of the system. So we started by cataloguing these tests, fixing them when necessary, and making them easier to run.
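The principle of a golden master can be sketched in a few lines: run the processing on a fixed input and compare the serialized output to a reference recorded earlier. The function and reference below are illustrative, not from the Signaux Faibles pipeline:

```typescript
import { strictEqual } from "node:assert";

// Minimal golden-master sketch: the test does not explain *why* this output
// is expected; it only guarantees that refactorings keep producing it.
function runPipeline(input: number[]): number[] {
  return input.map((x) => x * 2).filter((x) => x > 2);
}

// The "golden master": a snapshot of the output, recorded once and committed
const reference = JSON.stringify([4, 6]);

strictEqual(JSON.stringify(runPipeline([1, 2, 3])), reference);
console.log("golden master matches");
```

This is exactly the trade-off mentioned above: the reference says nothing about intent, but any behavior change during the redesign makes the comparison fail immediately.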

From then on, we were able to run all the existing tests from the continuous integration pipeline. This allowed us to ensure that each subsequent modification did not cause any anomaly, at least on the use cases covered by these tests.

Example: Migration to TypeScript in TDD and pair programming

Based on this, we decided to carry out a gradual redesign, function by function.

The redesign of each function consists of:

  • making the data types explicit (input and output, by using the TypeScript language);
  • writing unit tests whose names will serve as functional documentation;
  • then refactoring the function to make it more readable, robust, and easier to test.
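Put together, one such redesign iteration might look like this. The domain, names, and types are hypothetical, chosen to illustrate the three steps rather than reproduce the actual code:

```typescript
import { strictEqual } from "node:assert";

// Step 1 -- make input and output types explicit:
type Payment = { dueDate: string; paidDate: string | null };

// Step 3 -- the refactored, readable implementation:
function isLate(payment: Payment): boolean {
  if (payment.paidDate === null) return true; // unpaid counts as late
  return payment.paidDate > payment.dueDate; // ISO dates compare lexically
}

// Step 2 -- unit tests whose messages document the expected behavior:
strictEqual(isLate({ dueDate: "2022-01-31", paidDate: null }), true,
  "a payment with no paid date is late");
strictEqual(isLate({ dueDate: "2022-01-31", paidDate: "2022-01-15" }), false,
  "a payment made before its due date is not late");
console.log("all tests pass");
```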

To avoid spreading ourselves too thin by trying to solve too many problems at once, we apply the TDD (Test-Driven Development) method, except that we skip the “red” step (writing a test that doesn’t pass) in cases where the feature under test is already correctly implemented.

To move forward confidently, we mainly work in pair programming: one of us dictates to the other the changes they would like to make, then we switch roles every 10 minutes. This working arrangement also allows us to align explicitly and immediately on preferences and technical choices. In particular: how to name functions and variables.

Together, we make the manipulated “domain objects” explicit and describe the functions.

At the end of each session (usually 2h30), we submit a Pull Request, take the time to describe the changes made, observe the results of the continuous integration pipeline (including the coverage and quality indicators provided by Codacy), then merge these modifications into the main branch of the repository, once they are complete and functional.

The strategy we decided to follow fits Signaux Faibles’ expectations for the scope in question: to prevent data manipulation errors during the execution of the data processing pipeline, knowing that it lasts several hours. It is not necessarily the model to choose for your own redesign project, but it lets us lay out a concrete example and share the reasoning we followed in making these decisions.

Legacy code, an exciting challenge!

I hope I have managed to convince you that “legacy code” is not inevitable, and that redesigning legacy source code can be a technical challenge just as exciting and rewarding as (if not more so than) working on new source code.

I stress that the techniques and tools mentioned here were chosen according to the characteristics of the existing system, the team’s constraints and preferences (including mine), and our knowledge at the time. I in no way advocate following this example to the letter. On the contrary, I hope that you will retain from this article the reasoning we followed to adopt an approach.

How about you: have you ever worked on the redesign of a legacy codebase? I look forward to reading your story!

I would like to warmly thank my excellent pair programming partner, Pierre Camilleri, and the entire Signaux Faibles team for the support and trust they placed in me during this mission. Thanks also to Fabien Maury, Elodie Quezel and Laury Maurice for helping me write and improve this article.
Finally, thanks to Josian for his translation work!



Adrien Joly

👨‍💻 Software crafter @SHODO, legacy code / tech debt doctor · 🥁 Drummer of “Harissa”, VR lover, music digger