Keep your code refactorable

Growing confidence in the software you are building

Gerrit Stapper
NEW IT Engineering
May 10, 2021

An image showing a circular process of three steps to change software: Read — conclude — adjust. At the center is confidence that the developer needs during all the three steps to make the respective change properly
3 step process of refactorable code with confidence

In today’s software engineering world, we want to move quickly. Get features out there to receive early feedback. React to changing requirements to build the application that is actually required, not the one that was initially described. Learn and improve during the implementation and re-apply these lessons to older parts of the application.

Our code must be able to cope with this demanded velocity and flexibility. Otherwise we have a big problem: we lose reaction speed, ideas and todos pile up, and technical debt and problems grow.

To achieve the highly demanded agility at the core of our application, we need to keep it refactorable. Martin Fowler describes a refactoring as “a change made to the internal structure of software […] without changing its observable behaviour”. For me, refactorable describes a state of the application in which the code can be improved and extended at any time. Note that this does include adding new business requirements (that is the extended part), whereas Fowler explicitly excludes this, because that would change the “observable behaviour” of the software.

My goal with describing refactorable code is to get a sense for and a description of the atmosphere and context of a well-crafted software product.

What is refactorable code

As said in the definition above, being refactorable is a state of the application: Either it is refactorable or it is not. That means, given the current application code, if I add more code to it I have to pay attention to certain aspects in order to ensure it ends up staying refactorable. This is why the model you see at the top is circular.

The three phases each change to the code goes through are:

  1. Reading code
  2. Making conclusions on what needs to change
  3. Making the change

Further, there is an inner part of the model: the context or atmosphere. Let’s explore each of these four parts in a bit more detail.

Reading code

With every new software product there is initially no code at all, but this quickly changes. At some point we read a lot more code than we write, because new things must fit into the existing architecture (or the architecture must change) and thus intertwine with the rest of the application.

So with every change I want to make, I first need to read code. This is where the first information flows from the original author to me, the reader (who could be the same person, after all). The better the structure of the code, the easier it is for me to find a path through it to the point where I need to make my change (or believe I need to). The better the code is written, the easier it is to comprehend.

In total, the reading step is about the What: what do I see in front of me, where am I and where do I need to go.

Making conclusions

Getting from the What to the Why is all about making conclusions. From what you saw while reading the code, you now reason about why the code is the way it is. You try to understand the business requirement that the code in front of you fulfils at a certain place (a class, for example).

Here, we drill down to the spot where we are actually going to make our changes and a good understanding is crucial to do this without destroying what the application is currently capable of.

Making the change

Knowing what happens and why, we are now ready to make the actual change happen. This could, for example, be a bug fix, the implementation of a new requirement or a performance improvement.

This is also where the circle closes and starts anew: with every change, we must think about the next two phases, reading and concluding, which follow immediately after, either for you or for someone else.

The context

Whatever surrounds us influences how we go through the three phases, especially the third, when we make the changes. Think about a project with a tough deadline, a lot of pressure and stressed-out people. You will most probably not invest as much care and thought into how you craft your code, but treat the business requirement as the first and highest goal. That means the next reader will be impacted by what is produced in the current context, and the wheel starts spinning downwards! Over time, the code will succumb to this context and start to rot, which can prevent people from contributing at all for fear of breaking yet another thing.

However, if the external influence is confidence instead, it can have a significant positive impact. If the code is written well enough, you are far more confident that you actually understand what’s happening where, far more confident that you actually know why all of that is happening, and thus you can make the correct change with a lot more confidence, too. Over time, your iterations through the three phases will become faster and faster, people have a better chance of being onboarded quickly and your teamwork improves through a common understanding: a positive snowball effect!

Obviously, the confidence could be a false positive and make you blind to problems underneath. But we will explore how refactorable code can be achieved and with that hopefully reduce the risk of false positives to a minimum.

How to achieve refactorable code

Next, we will have a closer look at how we can actually get to the point of having refactorable code and then stay there. We will do so by moving through the three phases again — step by step. For each phase, I want to touch on different topics that I think in total help to finish each phase with as much confidence as possible.

Achieving readable code

When talking about readable code, we immediately need to bridge across to writing readable code. But it’s also a matter of mindset and strong communication within the team.

Tim Ottinger said “[…] programming is a social activity” in the Clean Code book by Robert C. Martin from 2009, when he was talking about naming things like classes, methods, functions or variables. It’s the names that can or cannot express meaning, transport a goal to the reader, help in discussions to actually talk about the same matter and overall help to find a common understanding of the business problem at hand. This goes in the direction of a ubiquitous language as Eric Evans describes it in Domain-Driven Design. If each and every developer finds new terms for existing things, the team won’t align, new joiners will have a hard time matching these synonyms and bugs due to misunderstanding are almost inevitable.

Now, this common language is a general thing to have. You’ll apply it once you start writing code. And this is the bridge I was mentioning initially: the more care someone takes when writing code, the easier it is to read. Put yourself into the perspective of others in your team who have not worked on this particular code before, maybe not even on this repository: do they have a chance to understand what’s going on? The metaphor I am aiming for is that I want to push the information to the reader, instead of making the reader pull it out of the code. The idea behind that is that I, as the author of the code, have all the context and all the information, while the reader does not. This makes it an order of magnitude harder for the reader to pull the information out than for me to push it into the code.

I don’t want to advocate for a certain technique here, but instead highlight this direct link between writing code, reading code and working as a team when building applications. The link is undeniable and the better it is managed, the more efficient the team will be. What I do want to do, however, is to advocate against a certain mindset that I find to be very harmful: “it works, I’m done!” The following image of a house is the perfect visualisation of that mindset:

Image of a house that clearly shows many problems: the chimney is off, the rain pipes are not properly connected, the entrance is on the first floor and only reachable via a ladder, and that ladder blocks a window. This house is a metaphor for an insufficient job and a mindset of “it works”, which is not enough
An insufficient house: it works is not enough (Image by Holger Zahnleiter from Accenture)

The rain pipe is off, the ladder to the entrance blocks one window, there’s going to be some water problem soon, the chimney is off. In total, it is a house, you could sleep in there and stay dry (at least for a while), but it probably won’t stand many winters. Coming back to the world of software: hacking down some lines of code, checking that the code does not break and immediately pushing it to the remote repository is exactly what produces these kinds of houses. There is no second thought, no polishing, no refactoring, no further tests, no documentation: nothing to give the next reader a fair chance of making her or his next step. And from there, things will just get worse. This mindset really puts the success of the team at risk, as it works directly against all the aspects mentioned above. It works is simply not enough!

Allowing meaningful conclusions

Referring back to the introduction of the conclusion step above, the main focus here is to understand why the code is how it is: what requirement it is solving, what bug it is fixing, what performance boost it is giving. Although our code is expressive by now and well thought through, I still need a layer on top to highlight interactions and explain them in natural language. There are two things that help me here:

  • Tests
  • Commit history

I will start off with tests, in particular with unit tests, as I have the most experience with that kind of test. Let’s look at the following Java example of a unit test to verify some exchange rate calculation. Every test comes with four major pieces of information:

An image showing a Java unit test code example with visual highlights for the method name of the test as the test case description, the input parameters for the method under test as the inputs, the output type and name as the outputs as well as the assertion as the expected behaviour.
Different pieces of information delivered through unit tests
  • The test case description through the method name of the test
  • The parameters of the test case through the parameters of the method under test
  • The actual output of the code through the return value and data type of the method under test
  • The expectations on the test case through the assertion verifying the expected value against the actual
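To make these four pieces concrete, here is a minimal sketch in plain Java. The class ExchangeRateCalculator, its convert method and the rate used are assumptions made up for illustration, not the code from the figure. A real project would most likely use a test framework such as JUnit instead of the hand-written check, which is only used here to keep the example self-contained.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ExchangeRateCalculatorTest {

    // Hypothetical class under test, invented for this sketch.
    static class ExchangeRateCalculator {
        BigDecimal convert(BigDecimal amount, BigDecimal rate) {
            // Multiply and round to two decimal places, as is common for money.
            return amount.multiply(rate).setScale(2, RoundingMode.HALF_UP);
        }
    }

    // 1. Test case description: the method name states the expected behaviour.
    static void convertsEurosToDollarsUsingTheGivenRate() {
        ExchangeRateCalculator calculator = new ExchangeRateCalculator();

        // 2. Inputs: the parameters handed to the method under test.
        BigDecimal euros = new BigDecimal("100.00");
        BigDecimal rate = new BigDecimal("1.2150");

        // 3. Output: the return value and its data type.
        BigDecimal dollars = calculator.convert(euros, rate);

        // 4. Expectation: the expected value is compared against the actual one.
        BigDecimal expected = new BigDecimal("121.50");
        if (!expected.equals(dollars)) {
            throw new AssertionError("expected " + expected + " but was " + dollars);
        }
    }

    public static void main(String[] args) {
        convertsEurosToDollarsUsingTheGivenRate();
        System.out.println("convertsEurosToDollarsUsingTheGivenRate passed");
    }
}
```

Even without opening the implementation, a reader of this test learns what the code is meant to do, with which inputs, and what result the author considered correct.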

The aspect I want to highlight here is that these pieces of information come from the author: someone put them there. Someone wrote down her or his knowledge of the requirements (the expected behaviour) and then set up a scenario that at least partially verifies this requirement is covered. When there are multiple tests, their descriptions give a great overview of what is to be expected of this part of the code. The author made her or his implicit knowledge explicit by writing it down. And now that we have those tests, someone else can easily go ahead and execute them, learn if there is something wrong and, in case there is a problem, have a description of what is expected instead. This is a great help when trying to understand code and what it should do.

Another thing I pay more and more attention to is the change history of my code. With tools like git, every commit consists of a number of changes in one or more files. Git makes these changes visible. What git cannot do for you, however, is to make the connection between them, to describe them as one unit of change, to associate them on a higher level with an overarching goal: implementing a new requirement, fixing a bug, doing a refactoring. The responsibility for making this visible lies with the author: she or he adds the why (things changed) on top of the what (changed), which git gives you for free. A commit message is a great way to express intention, problems faced, alternatives chosen and many more things in natural language. Ever had to use a more complicated solution because you are not using the latest JDK in your company? Over the course of multiple commits, you can always trace these changes and the thinking behind them. On top of that, most IDEs today can show you the matching commit and commit message for a line of code. That is something like a code comment, but much more powerful, as it covers all changes of that commit: the comment spans the whole change, instead of being in just one place. Additionally, whenever the same line of code is changed, there will be a new message and thus a new “comment”. They change over time, so the chance for them to rot is significantly lower than with traditional code comments (for a deeper discussion on this idea, see my other post: Commit Messages over Code Comments).

In both cases, with tests as well as the commit history, we again rely on the step of writing code. However, it is yet another facet of things to consider when being the author, yet another impact we have on the whole team. But it also shows what great impact good code can have!

Making the right changes happen correctly

Now we can finally talk hands-on coding! While producing code, there are two techniques that I find very helpful in giving me confidence that I am on the right track and, if I am not, that I can easily go back and try something else.

The first one is a matter of mindset and general software development approach. I think GeePaw Hill phrases this most prominently and most fittingly for me: “If you want more value faster, take Many More Much Smaller Steps” (see his blog post about it). What he focuses on is the fact that the fastest way to the goal is not a straight line, but instead a path with many twists and stops, each of which is used to re-evaluate the current situation and then decide which step to make next. This message stems from the understanding that zero uncertainty in software development is simply not possible and thus no plan can ever be a straight line.

My learning is that I break my coding tasks into smaller steps in general and then, when writing the code, into even smaller, almost tiny steps. One of these tiny steps could simply be to make some tests pass while leaving others open, or to implement the functionality without actually integrating it into the rest of the application. In both cases, the remaining changes make up another tiny step. I combine this with an “aggressive” use of git commits in order to have a checkpoint to fall back to (see my other post about it: Commits are to Code what Save Games are to Video Games). The work in progress is significantly reduced, I don’t have to worry about a huge number of changes I made and thus can simply try something out here and there. If it works, great: polish it a bit and then create another checkpoint. If it does not, no problem: simply drop all changes and start from the last checkpoint again. Finally, once you are happy with the solution and have cleaned it up a bit, you can squash all connected commits into one bigger commit. Martin Fowler calls this approach “compile-test-commit” in his refactoring book.

With this general working mode, I want to get more feedback on what I am doing in terms of the actual code, too: Am I going in the right direction? How is the code developing? Am I making a mess? What helps me here is Test-Driven Development (TDD) and especially the realisation that it helps me design a better application or, as Kent Beck puts it: “One of the ironies of TDD is that it isn’t a testing technique […] it’s a design technique, really a technique for structuring all the activities of development”. By writing a test first (and thus writing code that won’t compile, as the things it references do not exist yet; after all, the test comes first) you write down the design you wish your code would provide. That is, you write down what code the test would call and how to check it, and through that you automatically come up with the simplest design you can currently imagine. Why? Because that simple design makes your test simple and you don’t want to spend time setting up difficult tests.

This is also the first feedback I get: How much effort do I need to put into setting up my test? The higher the effort, the bigger the warning sign that the design is bloated and probably needs a refactoring. Once the implementation is added, I have a test to verify whether it satisfies what I initially planned. If it does, I have the chance to further refactor it (think about the reading and concluding steps here). It’s also a great foundation to add tests for edge cases and error cases, which further increases your confidence that your code at least partially covers the major aspects.
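The test-first flow described above can be sketched as follows. This is only an illustration in plain Java under stated assumptions: CurrencyConverter, its constructor and its convert method are hypothetical names that did not exist before the first test asked for them, and the hand-written check helper stands in for a test framework to keep the sketch self-contained.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CurrencyConverterTddSketch {

    // Step 1 (written first): this test fixes the API we wish we had.
    // One constructor argument and one call keep the setup trivial.
    static void convertsAmountUsingTheConfiguredRate() {
        CurrencyConverter converter = new CurrencyConverter(new BigDecimal("1.20"));

        BigDecimal converted = converter.convert(new BigDecimal("10.00"));

        check(new BigDecimal("12.00").equals(converted),
                "expected 12.00 but was " + converted);
    }

    // Step 3 (added once the first test passes): an error case.
    static void rejectsNegativeAmounts() {
        CurrencyConverter converter = new CurrencyConverter(new BigDecimal("1.20"));
        try {
            converter.convert(new BigDecimal("-1.00"));
        } catch (IllegalArgumentException expected) {
            return; // this is the behaviour the test asks for
        }
        throw new AssertionError("negative amounts should be rejected");
    }

    // Step 2: the simplest implementation that satisfies the tests.
    // Hypothetical class, invented for this sketch.
    static class CurrencyConverter {
        private final BigDecimal rate;

        CurrencyConverter(BigDecimal rate) {
            this.rate = rate;
        }

        BigDecimal convert(BigDecimal amount) {
            if (amount.signum() < 0) {
                throw new IllegalArgumentException("amount must not be negative");
            }
            return amount.multiply(rate).setScale(2, RoundingMode.HALF_UP);
        }
    }

    // Tiny stand-in for a test framework's assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        convertsAmountUsingTheConfiguredRate();
        rejectsNegativeAmounts();
        System.out.println("all checks passed");
    }
}
```

The point of the sketch is the order, not the domain: the setup of each test is two lines, and if it had needed ten, that would have been the early warning about the design.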

In total, while writing code I want to have as few things to focus on at a time as possible, while getting feedback about those few things as quickly as possible. It’s basically like making tiny steps on stairs with handrails to give me guidance and confidence.

Conclusion

Reacting to changing requirements is crucial to deliver a valuable product to the client. We need code that can change just as swiftly and support us in reaching this common goal. I hope that this post helps to split the activity of writing code into meaningful steps, which then support us in considering their respective requirements through the full process.

While TDD is not a silver bullet (see an interesting discussion with David Heinemeier Hansson, Martin Fowler and Kent Beck from 2014), it can certainly help us design a better system with integrated feedback mechanisms that tell us about problems as soon as they occur.

Clean code then builds on top of that and tries to put more meaning into the designed system by thinking about more fine-grained structuring of classes and components, their names, the APIs they expose and how they interact. Thinking from the perspective of others when creating code is an absolute must, and clean code can help to deliver on that.

Finally, we split things up into many small parts to reduce complexity. For this, we should leverage the tools at hand: let’s build well-scoped commits to keep the complexity of changes low, express more meaning with descriptive commit messages and keep our automated pipelines in the back of our minds when pushing code.

In total, these things help me tremendously every day when writing code, as I feel more confident, produce more stable software, allow others to handle the code more easily and thus foster efficient teamwork.

Hint: This post was published in a different form before and got a major rewrite. I learned a lot in the last few months about software quality, TDD, testing and many other topics that I wanted to integrate here.
