Properties of Code: Safe refactoring without tests

In algebra, certain transformations of an expression are mathematically equivalent. You may remember names like “The Associative Property” or “The Commutative Property” of equality. These are the rules and recipes used to transform something that is complex and hard to understand into something a bit more familiar and easier to work with. Not in one step, of course, but in many small transformations that each maintain equivalence. This sounds a lot like refactoring:

“Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior” (Fowler & Beck, 1999, p.53).
“Refactor (verb): to restructure software by applying a series of refactorings without changing its observable behavior” (Fowler & Beck, 1999, p.54).

The similarity between refactoring and mathematical transformations doesn’t stop there. But before we go into that, let’s do a quick review of some 8th-grade algebra you thought you’d never have to see again.

Identity Properties

We’ll start with something easy that we can all agree on. If we add 0 to any number the result is the same number. This is called “The Additive Identity” and it has a twin, “The Multiplicative Identity,” which states that if you multiply any number by 1 the result is the same number.

a + 0 = 0 + a = a
b(1) = 1(b) = b

Nothing groundbreaking here. Except a proof that it’s possible to add an element to an equation without changing the results! With that we have enough to start explaining the 6 Core Refactorings and how they can be used to safely change code without tests.

Adding a Local Variable

Let’s start with the refactoring equivalent of “The Additive Identity”: adding a local variable. We start with the following code:

void foo()
{
    bar(a + b + c);
}

We can refactor the code to:

void foo()
{
    var d;
    bar(a + b + c);
}

Does anyone think we have changed the observable behavior? [1]

Adding an unused variable, method, class or a comment will maintain equality. All are variations on the additive identity and the resulting observable behavior is the same behavior.
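As a rough check of that claim, here is a minimal Java sketch. The names `AddLocalVariableDemo`, `before`, and `after` are hypothetical and not from the original example. The call to bar() is replaced by a return value so the result can be observed, and `int d;` stands in for `var d;` because Java requires an explicit type for a local variable that has no initializer.

```java
// Hypothetical demo: an unused local variable does not change the result.
final class AddLocalVariableDemo {
    // Original shape: the expression is used directly.
    static int before(int a, int b, int c) {
        return a + b + c;
    }

    // After "adding a local variable": d is declared but never read or written.
    static int after(int a, int b, int c) {
        int d; // unused local, the code equivalent of "+ 0"
        return a + b + c;
    }
}
```

For any inputs, `before` and `after` return the same value; for example, both return 6 for the arguments 1, 2, 3.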

The Associative Property

Our next property of equality deals with grouping. “The Associative Property” states that changing how numbers are grouped, whether being added or multiplied, doesn’t change the result.

a + (b + c) = (a + b) + c
a * (b * c) = (a * b) * c

Again, this is standard stuff. There is a little bit of hidden complexity here in that we can get into trouble if we cross the addition/multiplication boundary.

a * (b + c) != (a * b) + c

That situation is covered by another property. But if we’re careful about what we include in our group we’ll be fine.
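To make the grouping rules concrete, here is a small Java sketch using int arithmetic. The class and method names are made up for illustration. It shows that regrouping within pure addition or pure multiplication keeps the result, while regrouping across the addition/multiplication boundary does not.

```java
// Hypothetical demo of safe and unsafe regrouping with int arithmetic.
final class GroupingDemo {
    // Pure addition: both groupings are equivalent.
    static int addLeft(int a, int b, int c)  { return (a + b) + c; }
    static int addRight(int a, int b, int c) { return a + (b + c); }

    // Pure multiplication: both groupings are equivalent.
    static int mulLeft(int a, int b, int c)  { return (a * b) * c; }
    static int mulRight(int a, int b, int c) { return a * (b * c); }

    // Crossing the boundary: these two are NOT equivalent in general.
    static int mixedInner(int a, int b, int c) { return a * (b + c); }
    static int mixedOuter(int a, int b, int c) { return (a * b) + c; }
}
```

With a = 2, b = 3, c = 4, both additions give 9 and both multiplications give 24, but `mixedInner` gives 14 while `mixedOuter` gives 10.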

Introduce Explaining Variable

To see how this might apply, let’s assign something to our variable, such as a piece of the complex expression inside the call to bar().

void foo()
{
    var d = a + b;
    bar(a + b + c);
}

This is the first step of the “Introduce Explaining Variable” refactoring (Fowler & Beck, 1999, p.124).[2] Or, rather, it would be if we’d used a name that explained anything. We’ll deal with that in a bit. This group of refactorings (extract variable, field, parameter, or method) is a way to group information into logical chunks.
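The same grouping idea carries over to the other members of the extract family. As a sketch only, here is what extracting the same chunk into a method might look like in Java. The names `ExtractMethodDemo` and `subtotal` are hypothetical, and the call to bar() is again replaced by a return value so the result can be compared.

```java
// Hypothetical demo: "Extract Method" groups the same chunk (a + b) that
// "Introduce Explaining Variable" would, just behind a method name.
final class ExtractMethodDemo {
    // Before: the whole expression is written inline.
    static int before(int a, int b, int c) {
        return a + b + c;
    }

    // The extracted group; the name is an assumption made for this sketch.
    static int subtotal(int a, int b) {
        return a + b;
    }

    // After: the group is referenced by name instead of being spelled out.
    static int after(int a, int b, int c) {
        return subtotal(a, b) + c;
    }
}
```

Because Java evaluates `a + b + c` as `(a + b) + c`, the extracted method names a group that already existed in the original expression.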

The Substitution Property

To take advantage of that grouping we can look at a third property of equality. “The Substitution Property” states that if two values are equivalent we can substitute one for another in an expression.

if a = b
then b + c = a + c

Introduce Explaining Variable cont.

We apply the Substitution Property in the next step of our refactoring by replacing the original use of the complex expression with the local variable.

void foo()
{
    var d = a + b;
    bar(d + c);
}

We’ve just completed the refactoring. The code was able to execute at each small step, and behavior was preserved. If our code had contained bugs, we would have maintained bug-for-bug compatibility. That’s a pretty high bar for “no changes to observable behavior.”
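The three stages of this refactoring can be sketched side by side in Java. This is an illustration under assumptions: the class name and the `step0` to `step2` method names are invented, the variables are assumed to be ints, and `bar` simply returns its argument so that the value it receives can be compared.

```java
// Hypothetical demo: each stage of "Introduce Explaining Variable"
// passes the same value to bar().
final class ExplainingVariableDemo {
    // Stand-in for bar(): returns its argument for comparison.
    static int bar(int x) {
        return x;
    }

    // Original code.
    static int step0(int a, int b, int c) {
        return bar(a + b + c);
    }

    // Variable introduced and assigned, but not yet used.
    static int step1(int a, int b, int c) {
        int d = a + b;
        return bar(a + b + c);
    }

    // Expression replaced by the variable (substitution).
    static int step2(int a, int b, int c) {
        int d = a + b;
        return bar(d + c);
    }
}
```

Bug-for-bug compatibility shows up here too. If a is `Integer.MAX_VALUE`, b is 1, and c is 0, the sum overflows, and all three versions return the same wrapped value, `Integer.MIN_VALUE`.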

Rename Variable

We’ve shown it works for one refactoring. What about the others? Unfortunately, the name of the variable we added does not reveal how it is intended to be used. We can fix that with the refactoring “Rename Variable.”

The first step is to create a variable with the new name in the same scope.

void foo()
{
    var totalPrice;
    var d = a + b;
    bar(d + c);
}

Then, copy the value assigned to the old variable into the new variable:

void foo()
{
    var totalPrice = a + b;
    var d = a + b;
    bar(d + c);
}

Change the old variable so that it takes its value from the new variable:

void foo()
{
    var totalPrice = a + b;
    var d = totalPrice;
    bar(d + c);
}

Find all references to the old variable and change them to the new one:

void foo()
{
    var totalPrice = a + b;
    var d = totalPrice;
    bar(totalPrice + c);
}

Finally, remove the old variable:

void foo()
{
    var totalPrice = a + b;
    bar(totalPrice + c);
}

That was a lot of really tiny steps! But again, no step put the code in a non-working state or changed the observable behavior. This is made significantly easier and faster with automated refactoring tools. These refactorings can be done safely in seconds, and with less possibility of human error.
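For readers who want to convince themselves, the steps above can be written as separate Java methods and compared. This is only a sketch: the class and method names are hypothetical, the variables are assumed to be ints, `bar` returns its argument so it can be observed, and the first step (declaring `totalPrice` without a value) is folded into the second.

```java
// Hypothetical demo: every intermediate step of "Rename Variable"
// passes the same value to bar().
final class RenameVariableDemo {
    // Stand-in for bar(): returns its argument for comparison.
    static int bar(int x) {
        return x;
    }

    // Start: only the old name exists.
    static int start(int a, int b, int c) {
        int d = a + b;
        return bar(d + c);
    }

    // The new name is declared and given the same value as the old one.
    static int bothNames(int a, int b, int c) {
        int totalPrice = a + b;
        int d = a + b;
        return bar(d + c);
    }

    // The old name now takes its value from the new one.
    static int oldReadsNew(int a, int b, int c) {
        int totalPrice = a + b;
        int d = totalPrice;
        return bar(d + c);
    }

    // References switch to the new name; d is now unused.
    static int referencesSwitched(int a, int b, int c) {
        int totalPrice = a + b;
        int d = totalPrice;
        return bar(totalPrice + c);
    }

    // The old name is removed.
    static int finished(int a, int b, int c) {
        int totalPrice = a + b;
        return bar(totalPrice + c);
    }
}
```

Calling each method with the same arguments, for example 10, 5, 2, returns 17 every time.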

Conclusion

We showed micro steps for 2 of the 6 core refactorings and reviewed a few math properties of equality (there are 8 and we’ve only used 3). If we were to follow an example of “Inline” (variable, field, parameter, or method), we could add a review of “The Transitive Property” (if a = b and b = c, then a = c). However, I think my point has been made.

BTW: It was not to show you how to do these refactorings by hand (please don’t). I broke down these steps to make them visible and show how larger refactorings are based on small steps, each safe, each mathematically equivalent, making the whole safe. When used in a disciplined way they are quite possibly the safest way that you can interact with code.

Early on it was suggested that tests were a prerequisite to refactoring (Fowler & Beck, 1999; Feathers, 2004). According to Fowler & Beck (1999):

“If you want to refactor, the essential precondition is having solid tests. Even if you are fortunate enough to have a tool that can automate the refactorings, you still need tests” (p.89).

I believe this came from a misplaced confidence in the ability of tests to prevent bugs from entering a system when making these changes. I would like to state unequivocally that tests are neither necessary nor sufficient to prevent bugs. This is true for the same reasons that smoke alarms are neither necessary nor sufficient to prevent fires: at best, they can tell us after the fact that one has occurred. That isn’t to say that I don’t write tests or use Test-Driven Development (TDD). I do (I also make sure my smoke detectors are functioning). Tests have many benefits for design, communication of dynamic behavior, and code safety, not the least of which are early detection and keeping bugs from spreading.

However, the real magic for safely transforming software, and preventing bugs in the first place, is in disciplined refactoring. Refactorings can be proven to not introduce bugs. William Opdyke stated as much when he wrote “Fortunately, determining whether a refactoring is safe is often trivial, especially for the low-level refactorings that constitute most of our refactoring … By showing that each step of a more complicated refactoring is safe, we can know by construction that the refactoring is safe” (Fowler & Beck, 1999, p.392).

We should be embracing tools and practices such as IDEs with automated refactoring, Read-by-Refactoring, and naming as a process, which provide real safety and are both necessary and sufficient to prevent bugs.

Notes

[1] There are those who will argue that this changes performance and therefore behavior (probably not, given how modern languages work). However, I would put forth that there is a subtle but important difference between optimization and refactoring. I think Michael Feathers said it best: ‘Optimization is like refactoring, but when we do it, we have a different goal. With both refactoring and optimization, we say, “We’re going to keep functionality exactly the same when we make changes, but we are going to change something else.” In refactoring, the “something else” is program structure; we want to make it easier to maintain. In optimization, the “something else” is some resource used by the program, usually time or memory.’

[2] This is more often referred to as “Introduce Local Variable” or “Extract Variable.”

References

Feathers, M. (2004). Working effectively with legacy code. Prentice Hall Professional.

Fowler, M., & Beck, K. (1999). Refactoring: improving the design of existing code. Addison-Wesley Professional.