Let’s Define Legacy Code

Sarthak Makhija
6 min read · Apr 12, 2018

“I have been having sleepless nights trying to add features to the code we acquired from another company. I am dealing with the purest form of Legacy Code.”

“I am having a really hard time dealing with tangled, unstructured code that I have to work with but don’t understand one bit. Legacy Code!”

Legacy code is a term with many definitions: code acquired from someone else, code written by someone else, code that is hard to understand, or code written in outdated technologies. Whatever the definition, most of us believe Legacy Code is scary.

Question> How would you define legacy code?

Defining Legacy Code

Michael Feathers, in his book “Working Effectively with Legacy Code”, defines legacy code as code without tests.

Code without tests is bad code. It doesn’t matter how well written it is, how well structured it is, or how well encapsulated it is. Without tests, there is no way to tell if our code is getting better or worse.

A slightly modified version of this definition is “code without unit tests is legacy code”. It is always better to have tests as close to the code as possible (unit tests > integration tests > UI tests), so it is not unfair to call code without unit tests legacy code.

Working with Legacy Code

Question> What approach will you take if you were to make a change in legacy code?

Most of us might say, “I will make the change and call it a day. Why bother improving the code?” The rationale behind this thinking could be:

  • I don’t have enough time to refactor the code; I would rather make the change and complete my story.
  • Why risk changing the structure of code that has been running in production for a long time?
  • What is the overall benefit of refactoring legacy code?

Michael Feathers calls this style of making changes Edit and Pray. You plan and make your changes, and when you are done, you pray, and then pray harder, that you got them right.

With this style, we only add to the amount of legacy code.

There is a different style of making changes: Cover and Modify. Build a safety net, make changes to the system, let the safety net provide feedback, and act on that feedback.

It is safe to say that Cover and Modify is the way to go when dealing with legacy code.
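The Cover and Modify loop can be sketched in plain Java, without a test framework. This is only an illustration under stated assumptions: `LegacyLateFee`, `ModifiedLateFee`, and the fee rules inside them are hypothetical and do not come from any real system. The sketch records what the legacy routine does today (cover), introduces a restructured version (modify), and then asks the recorded values whether behavior was preserved (feedback).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

// Hypothetical legacy routine: nobody remembers why the numbers are what they are.
class LegacyLateFee {
    static double feeFor(int daysLate) {
        double fee = 0;
        if (daysLate > 0) {
            fee = 1.5;
            if (daysLate > 3) {
                fee = fee + (daysLate - 3) * 1.5;
            }
        }
        return fee;
    }
}

// The change we want to make: same behavior, clearer structure.
class ModifiedLateFee {
    private static final double BASE_FEE = 1.5;
    private static final double FEE_PER_EXTRA_DAY = 1.5;
    private static final int GRACE_DAYS = 3;

    static double feeFor(int daysLate) {
        if (daysLate <= 0) return 0;
        int extraDays = Math.max(0, daysLate - GRACE_DAYS);
        return BASE_FEE + extraDays * FEE_PER_EXTRA_DAY;
    }
}

class CoverAndModifySketch {
    // Cover: record what the code does today for a handful of inputs.
    static Map<Integer, Double> recordBehavior(IntToDoubleFunction code, int... inputs) {
        Map<Integer, Double> recorded = new LinkedHashMap<>();
        for (int input : inputs) recorded.put(input, code.applyAsDouble(input));
        return recorded;
    }

    // Feedback: true only if the candidate reproduces every recorded value.
    static boolean preservesBehavior(Map<Integer, Double> recorded, IntToDoubleFunction candidate) {
        for (Map.Entry<Integer, Double> e : recorded.entrySet()) {
            if (Double.compare(e.getValue(), candidate.applyAsDouble(e.getKey())) != 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Integer, Double> safetyNet = recordBehavior(LegacyLateFee::feeFor, -1, 0, 1, 3, 4, 10);
        // Prints "Behavior preserved: true"
        System.out.println("Behavior preserved: " + preservesBehavior(safetyNet, ModifiedLateFee::feeFor));
    }
}
```

In a real project the recorded values would live in unit tests rather than in a map built at runtime, but the order of the steps stays the same: the net comes first, the change second.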

Question> But should you even spend time writing tests for legacy code, or even thinking about refactoring it?

Should you even spend time thinking about refactoring Legacy Code?

The Boy Scout Rule

The idea behind the Boy Scout Rule, as stated by Uncle Bob, is fairly simple: leave the code cleaner than you found it! Whenever you touch old code, you should clean it properly. Do not apply a shortcut that makes the code harder to understand; treat it with care instead. It’s not enough to write code well; the code has to be kept clean over time.

Applied to legacy code, the Boy Scout Rule sends a very strong message: “leave a trace of understanding behind you for others to follow”. This means we will refactor the code to make it more understandable, and in order to refactor, we will build a safety net around it.

Now that we understand we cannot take shortcuts, the only option left is to write some tests, refactor the code, and proceed with development.

Questions>

  • Which tests should we write?
  • How much should we refactor?

Which Tests To Write

In nearly every legacy system, what the system does is more important than what it is supposed to do.

The tests we need when we want to preserve behavior are called characterization tests. A characterization test is a test that characterizes the actual behavior of a piece of code. There’s no “Well, it should do this” or “I think it does that”. The tests document the actual current behavior of the system.

Writing Characterization Test

A characterization test, by definition, documents the actual current behavior of the system, exactly as it runs in the production environment.

Let’s write a characterization test for a Customer object that generates a text statement for the movies rented by a customer.

import static com.code.legacy.movie.MovieType.CHILDREN;
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CustomerTest { // class name is illustrative
    @Test
    public void shouldGenerateTextStatement() {
        Customer john = new Customer("John");
        Movie childrenMovie = new Movie("Toy Story", CHILDREN);
        int daysRented = 3;
        Rental rental = new Rental(childrenMovie, daysRented);
        john.addRental(rental);
        // Placeholder expectation: we do not know the real output yet.
        String statement = john.generateTextStatement();
        assertEquals("", statement);
    }
}

This test attempts to understand (or characterize) the “Text Statement” generation for a customer who has rented a children’s movie for 3 days. Because we do not understand the system (at least not yet), we assert against a blank statement; any dummy value would do.

Let’s run the test and let it fail. When it does, we have found out what the code actually does under that condition.

java.lang.AssertionError: 
Expected :""
Actual :Rental Record for John, Total amount owed = 12.5. You earned 4 frequent renter points.

Now that we know the behavior of the code, we can go ahead and change the test.

import static com.code.legacy.movie.MovieType.CHILDREN;
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CustomerTest { // class name is illustrative
    @Test
    public void shouldGenerateTextStatement() {
        // Copied verbatim from the "Actual" line of the failure output above.
        String expectedStatement = "Rental Record for John, Total amount owed = 12.5. You earned 4 frequent renter points.";
        Customer john = new Customer("John");
        Movie childrenMovie = new Movie("Toy Story", CHILDREN);
        int daysRented = 3;
        Rental rental = new Rental(childrenMovie, daysRented);
        john.addRental(rental);
        String statement = john.generateTextStatement();
        assertEquals(expectedStatement, statement);
    }
}

Hold on, did we just copy the output generated by the code and paste it into our test? Yes, that is exactly what we did.

We aren’t trying to find bugs right now. We are trying to put in a mechanism to find bugs later, bugs that show up as differences from the system’s current behavior. When we adopt this perspective, our view of tests is different: they don’t have any moral authority; they just sit there documenting what the system really does. At this stage, it’s very important to have that knowledge of what the system actually does recorded somewhere.

Question> What is the total number of tests that we write to characterize a system?

Answer> It’s infinite. We could dedicate a good portion of our lives to writing case after case for any class in legacy code.

Question> When do we stop then? Is there any way of knowing which cases are more important than others?

Answer> Look at the code we are characterizing. The code itself can give us ideas about what it does, and if we have questions, tests are an ideal way of asking them. At that point, write a test or tests that cover a good enough portion of the code.

Question> Does that cover everything in the code?

Answer> It might not. But then we do the next step. We think about the changes that we want to make in the code and try to figure out whether the tests that we have will sense any problems that we can cause. If they won’t, we add more tests until we feel confident that they will.
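One way to check whether the existing tests would sense a problem is to imagine a plausible slip in the code we are about to touch and see whether any pinned case notices it. The sketch below is a minimal illustration of that idea, not the rental system from this article: `legacyCharge`, `carelessEdit`, and their pricing rules are hypothetical. A single 3-day case never reaches the surcharge branch, so it cannot sense a mistake made there; adding 4-day and 10-day cases fixes that.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

class RentalChargeSketch {
    // Hypothetical legacy pricing: flat charge up to 3 days, then a daily surcharge.
    static double legacyCharge(int daysRented) {
        double charge = 1.5;
        if (daysRented > 3) {
            charge += (daysRented - 3) * 1.5;
        }
        return charge;
    }

    // A plausible slip while editing the surcharge branch: 3 became 2.
    static double carelessEdit(int daysRented) {
        double charge = 1.5;
        if (daysRented > 3) {
            charge += (daysRented - 2) * 1.5;
        }
        return charge;
    }

    // Pin the current output of the legacy code for the given inputs.
    static Map<Integer, Double> characterize(int... daysRented) {
        Map<Integer, Double> pinned = new LinkedHashMap<>();
        for (int days : daysRented) pinned.put(days, legacyCharge(days));
        return pinned;
    }

    // Would these pinned cases notice if the candidate behaved differently?
    static boolean senses(Map<Integer, Double> pinned, IntToDoubleFunction candidate) {
        for (Map.Entry<Integer, Double> c : pinned.entrySet()) {
            if (Double.compare(c.getValue(), candidate.applyAsDouble(c.getKey())) != 0) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<Integer, Double> firstAttempt = characterize(3);
        Map<Integer, Double> afterAddingCases = characterize(3, 4, 10);
        // Prints "3-day case alone senses the slip: false"
        System.out.println("3-day case alone senses the slip: "
                + senses(firstAttempt, RentalChargeSketch::carelessEdit));
        // Prints "with 4- and 10-day cases: true"
        System.out.println("with 4- and 10-day cases: "
                + senses(afterAddingCases, RentalChargeSketch::carelessEdit));
    }
}
```

The point is not to enumerate every possible mistake, only to confirm that the branches near the planned change are exercised by at least one pinned case.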

How Much To Refactor?

There is so much to refactor in legacy code, and we cannot refactor everything. To answer this question, we need to go back to our purpose in refactoring legacy code.

We want to refactor legacy code to leave it cleaner than it was when it came to us and to make it understandable for others.

With that said, we want to make the system better while keeping the focus on the task. We don’t want to go crazy with refactoring and try to rewrite the whole system in a few days. What we want to do is “refactor the code that comes in our way of implementing any new change”. We will try to understand this better with an example in the next article.

Conclusion

References

  • Working Effectively with Legacy Code

Let’s take an example of Legacy Code and deal with it professionally in the next article, “Let’s Deal with Legacy Code”.
