Testing Already-Written (Legacy) Code

Writing tests is hard.

At Quartet, our mission is to build a collaborative mental and physical health ecosystem to allow every person in our society to thrive. We’re attempting this on the web using a modern web application stack and agile processes. Despite all the learning our agile process gives us, we often find ourselves in situations where experimental code hangs around longer than expected and makes its way into production with no test coverage.

There’s a lot of great information on how to write tests as you write new code, but not as much devoted to adding tests to legacy code. In this post, I’ll do that using JavaScript examples. I’m going to take a pragmatic approach; the goal is to add more tests than you have now to build confidence in your system. Some bad tests are better than no tests at all.

The long-term goal of testing is increased confidence in code by asserting that the correct behavior is happening and by documenting that behavior with code that is run often (i.e. an automated test). The short term goal of testing is establishing a baseline for the behavior for modules, which should allow us to change implementation details with confidence.

Essentially: Write tests for the live code, then refactor that code against those tests. If code changes and the tests pass, we have confidence we haven’t changed behavior.

The most difficult and time-consuming portion of adding tests to legacy code is usually the initial hump. You’re usually forced to change the structure of code and add abstractions. Not coincidentally, it’s the easiest portion to decide that you’re not only going to not clean up this crap, but you’re quitting tech altogether to become a corn farmer and you’re getting a straw hat and overalls.

Accept the risk and stay with it. Remember, the goal is increased confidence in your system, and that is often well worth the cost.

I’m going to show some examples from a recent project at Quartet. I added some tests and refactored some legacy code that was written under experimental conditions and would have to be used more heavily in the future. This is client-side code that makes requests and manages responses from the server, so it’s in our interest to have this well understood and tested to ease refactoring as we nail down the architecture of our young systems.

The file was organized as a group of single functions that were exported as a package (we’re building our front-end code with gulp, webpack and babel in a node environment). Here’s an example of a couple of the functions:

A snippet of the original file

Before I started refactoring, I tried to get a sense of the inputs and outputs and dependencies of the methods. These two methods both take in some domain data, return a promise, and depend on assert, Promise, and superagent. Taking a look at the interface of the module as a whole, it looks like it returns an object with the names of methods as keys and the methods as values using ES6 object shorthand.

The biggest issue with testing these methods is the new keyword on lines 8 and 24. Instantiating a new object in a method means it won’t be easy to mock its functionality. Sometimes that’s not a big deal. In this case, because the code being instantiated makes requests across the wire, it should be mocked.

This is where the pragmatic part comes into play. It’s going to take some judgement to know the best course of action to take. There are purists who would say that you absolutely should not refactor the code until you have some tests on the behavior of that code. There are cowboys who’d say let it break and we’ll add tests to the broken edge cases. You’ll have to find the right approach for you.

Since I was familiar with testing the behavior of the system via the actual product features (i.e. clicking around on the website), I felt confident factoring out the object instantiation, then testing the affected parts of the system. Here’s what it looked like after the change:

First refactoring. Assume config and QuartetLogger exist

The request and error handling have been abstracted from the individual functions. Their new responsibilities are ensuring the data exists before being used, and delegating to the proxy to handle requests. The functions are wrapped in object literals to provide more structure, and the proxy is now a member of the object literal (which I plan to mock in the tests). The proxy still has a new object being instantiated in it, but this change makes this module much easier to write tests for.

After running through the features to make sure everything worked, I committed this code then started on the tests. When thinking what to test, I looked at the exit conditions of a function. In both of these functions, non-blank lines coincide with exit conditions. What I want to do write tests which assert the output based on input I provide. In this case, it’s easy: an error is thrown if the values being asserted are falsy, so to test the error paths, I want to null out each tested value. A request is made when all the data are present, so I want to spy on that method and assert it gets called when testing the good path.

A snippet of an ugly, but working test for registerNewUser

The interior describe block shows two of of the possible five exit conditions. When asserting errors are thrown, chai’s expect takes a function which it calls. Because registerNewUser takes an argument, I wrapped its invocation in an anonymous function defined in the ES6 fat arrow style, then passed that to expect.

The assertion has a gnarly homemade stub on line 49 that it uses as the context for the function call for this test. On line 33 and 43 in the testable-session snippet above, this refers to newContext defined here. This should be cleaned up in the future (most likely with some kind of dependency injection into the constructor), but it’s fine for now; the goal is building confidence in the system through tests.

Both these tests pass, and changing the inputs makes them fail in ways I would expect. Feeling confident in this solution, it’s now a matter of adding similar tests the rest of the functions in the module, which I won’t show here.

The final task is adding tests for the proxy object to establish a good baseline of behavior. To make this cleaner, I moved the proxy into its own file:

There are two parts of this code that have the potential to interfere with the ease of testing: the instantiation of a new object on line 5 and the use of a local lexical variable to store superagent (instead of an instance variable for example). A fix for that:

superagent and logger have become instance variables which will allow simpler stubbing of functionality, and promises are returned from static method calls to resolve and reject, rather than wrapping a superagent request. With things in a slightly cleaner state, it’s time to add tests.

The first 27 lines are setup, with a homemade stub occupying 12–27. Similar to the service test, these methods are being called with a hacked up mock-context that should be refactored and cleaned up later.

There are two main tests for the post method: one asserting the behavior of an error path and one asserting the behavior of a good path. The two tests are simple and reflect the simple nature of the code they’re testing. The success test asserts that the an info message is logged with the resolved value. The failure test asserts that an error message is logged with the rejected value.

There’s no correct answer to the question of “at what point is my behavior sufficiently tested?” Here, I’m happy asserting that messages are logged. Some people might want to assert that promises are returned or that values are properly rejected or resolved. It’s up to you and what you’re trying to accomplish. I’m trying to gain more confidence in this system, so this is enough for now. These tests will likely grow as more edge cases are revealed and more functionality is added to the proxy. They should be refactored when there is more time, but they are completely usable now.

This is a good first step into getting a clean suite of tests. Keep in mind that some bad tests are better than no tests. Some the things to remember when writing tests for legacy code:

  • be pragmatic; the goal is confidence
  • pay attention to inputs and outputs of functions and modules
  • abstract code that is hard to test