How to create a culture of continuously refactoring code?

Published in Melio’s R&D blog, Feb 14, 2023

Intro

Just like anything else, code rots over time. In this post, you will learn why that happens and a possible remedy: a mechanism for continuously refactoring code.

I am basing my suggestion on my own experience and the battle-tested methods I have been using constantly throughout my career.

How come code rots?

In physics, the half-life is the time required for a quantity to fall to half of its initial value. It is commonly used to describe how fast matter decays.

Code, like decaying matter, also rots at some rate over time.

These are the three main causes of code rot:

My coding abilities improve over time
Do you know that feeling when you come across nasty code from a few years ago and then realize you wrote it?
The code I write represents the outcome of my engineering work at that point in time. It is as fresh as the date of the commit. But my coding skills keep evolving, so the minute after my commit, the code has already begun to decay.
Business needs and scale fluctuate
The business shifts and my assumptions lose their grip on reality. The most common example is an increase in scale on the system my code supports. I might design a solution assuming X requests per minute (RPM), but the same code may not be able to support 10X RPM.
Deliberate tech debt
Timeline considerations might drive you to compromise on code quality from the beginning. This is a deliberate decision you make while coding. It is not code rot, but the effect is the same: your code does not meet the software engineering guidelines you strive for.

Pains of code rot

Here’s a list of the main pains you might be feeling as a coder:

Tests that are not refactored become a liability and no longer serve their purpose. Just like production code, they turn into a burden instead of a benefit.
I do copy-paste programming, which generates code duplication and makes the code harder to refactor.
I have identified places I want to refactor, but I don’t manage or plan a tech backlog for them.
The big ball of mud effect kicks in: every time I add code to an area that needs refactoring, the problem gets worse.
In some cases, test coverage is not good enough, so even a small refactoring becomes a risky change.
Most medium to large refactorings never happen because I don’t have the time to do them.

But why bother with refactoring? What’s the end goal?

Look at mature software codebases, for example, the Linux kernel. After years of refactoring and improvements, the refactoring frequency drops and overall stability is sustained. That is a point of maturity that we, as SaaS developers, rarely reach in our systems.

Only a few of you out there are currently working on the next Linux-kernel-like codebase that will reach software maturity, but that target, like a lighthouse, can guide us.

Terminology: Day-to-day refactoring vs strategic refactoring

Refactoring is a controlled technique for improving the design of an existing code base.
All refactoring should be regression free and safe to deploy without impacting the outside interaction with the code.

In some cases, the target design requires changes to how the code is used or to its side effects. Keep in mind that this increases the refactoring effort, so make such a decision deliberately.

Examples of day-to-day refactoring:

Code extractions to reusable code or libraries
Renaming functions or parameters
Improving tests
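
As a minimal sketch of the first two items, here is a hypothetical Python example. The function names and the fee rule are invented for illustration and do not come from a real codebase. Two functions contain the same copy-pasted fee calculation; the refactoring extracts it into one named helper, and a small regression check confirms that the outward behavior did not change.

```python
# Hypothetical example: all names and the 2% / 30-cent fee rule are invented.

# Before: the same fee logic is copy-pasted in two places.
def card_payment_total_before(amount_cents: int) -> int:
    fee = max(30, amount_cents * 2 // 100)
    return amount_cents + fee


def bank_transfer_total_before(amount_cents: int) -> int:
    fee = max(30, amount_cents * 2 // 100)
    return amount_cents + fee


# After: the duplicated logic lives in one descriptively named helper,
# and the magic numbers are replaced by named constants.
MIN_FEE_CENTS = 30
FEE_PERCENT = 2


def processing_fee_cents(amount_cents: int) -> int:
    """Return the fee: FEE_PERCENT of the amount, but at least MIN_FEE_CENTS."""
    return max(MIN_FEE_CENTS, amount_cents * FEE_PERCENT // 100)


def card_payment_total(amount_cents: int) -> int:
    return amount_cents + processing_fee_cents(amount_cents)


def bank_transfer_total(amount_cents: int) -> int:
    return amount_cents + processing_fee_cents(amount_cents)


def behaves_the_same(sample_amounts: list[int]) -> bool:
    """Regression check: old and new versions agree on every sample input."""
    return all(
        card_payment_total_before(a) == card_payment_total(a)
        and bank_transfer_total_before(a) == bank_transfer_total(a)
        for a in sample_amounts
    )
```

Running `behaves_the_same` over a handful of representative amounts before deleting the old functions is one cheap way to keep a day-to-day refactoring regression free.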

Examples of strategic refactoring:

Breaking down a monolithic service into smaller microservices
Migrating to another database or model which is more aligned with the domain and scale
Migrating a service to another stack or technology to better address functional or non-functional requirements
Extending the system design to prevent or recover error flows

Refactoring is motivated by goals such as:

Increasing dev velocity
Increasing system stability

Day-to-day refactoring mechanism

We should encourage the team to favor day-to-day refactoring over strategic refactoring.

When estimating a task, if the team identifies a day-to-day refactoring opportunity, we should factor it into the estimate and the definition of done (DoD), for example 2 days of refactoring plus 4 days of implementing the requirements.

The Boy Scout rule: always leave the campground cleaner than you found it.
If the team identifies an opportunity to refactor during development, and the refactoring fits into the estimate (or delays the task by no more than about 20%, or an additional 2 to 3 days), then we should favor doing the refactoring.

If the team identifies an opportunity to refactor during development that can’t fit into the task estimate, we should treat it as planned refactoring.
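
The rule above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and reading "20% or 2 to 3 days" as "within 20% of the estimate or within 3 days, whichever allows more" is one interpretation that each team should tune for itself.

```python
from enum import Enum


class Decision(Enum):
    REFACTOR_NOW = "refactor as part of the current task"
    PLAN_FOR_LATER = "add a task to the refactoring backlog"


def decide_refactoring(
    estimate_days: float,
    slack_days: float,
    refactoring_days: float,
    max_delay_ratio: float = 0.20,  # assumption: up to 20% overrun is acceptable
    max_delay_days: float = 3.0,    # assumption: upper end of "2 to 3 days"
) -> Decision:
    """Decide whether a refactoring found mid-task is done now or planned.

    estimate_days    -- the task's original estimate
    slack_days       -- time still unused inside that estimate
    refactoring_days -- estimated effort of the refactoring itself
    """
    # How far the task would run past its estimate if we refactor now.
    overrun_days = max(0.0, refactoring_days - slack_days)

    fits_in_estimate = overrun_days == 0.0
    acceptable_delay = (
        overrun_days <= estimate_days * max_delay_ratio
        or overrun_days <= max_delay_days
    )

    if fits_in_estimate or acceptable_delay:
        return Decision.REFACTOR_NOW
    return Decision.PLAN_FOR_LATER
```

For example, a 2-day refactoring found during a 4-day task with no slack left overruns by 2 days, which is within the 3-day allowance, so the sketch returns `Decision.REFACTOR_NOW`. An 8-day refactoring on a 5-day task returns `Decision.PLAN_FOR_LATER`.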

Planned refactoring: executing the refactoring backlog

Anyone on the team can create a refactoring task and add it to the backlog.

The team should have a refactoring backlog (maybe together with the technical backlog).

In what capacity will we be doing these planned refactoring tasks? It depends on whether the task is a small refactoring or a strategic refactoring:

Small refactoring tasks:

If we are about to develop in the area covered by a small refactoring task, we should add it to the definition of done and do both the refactoring and the task (see “When estimating a task” above).
A refactoring task with no upcoming development in its area should not be implemented unless the code creates a quality point of failure. Our motivation is to avoid refactoring that has no impact, so we should not refactor stable code that rarely fails and does not require maintenance.

Strategic refactoring tasks:

Strategic refactoring tasks can be picked up when the team prioritizes them as part of the technical backlog.
A product backlog item requires refactoring when the current architecture can’t support the new requirements, so the refactoring becomes a prerequisite.
Strategic refactoring can also be done when it is needed to properly implement an action item (AI) from a production incident post-mortem.
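
The triage rules for planned refactoring can be summarized in a short sketch. The `RefactoringTask` fields and the outcome strings are hypothetical names chosen for illustration; they only encode the rules listed above and are not a prescribed workflow or tool.

```python
from dataclasses import dataclass


@dataclass
class RefactoringTask:
    title: str
    is_strategic: bool
    # Signals for small refactoring tasks
    upcoming_work_in_area: bool = False      # development is planned in the same code
    is_quality_risk: bool = False            # the code creates a quality point of failure
    # Signals for strategic refactoring tasks
    prioritized_in_tech_backlog: bool = False
    blocks_product_requirement: bool = False
    is_post_mortem_action_item: bool = False


def triage(task: RefactoringTask) -> str:
    """Return what to do with a task from the refactoring backlog."""
    if task.is_strategic:
        if (
            task.prioritized_in_tech_backlog
            or task.blocks_product_requirement
            or task.is_post_mortem_action_item
        ):
            return "schedule"
        return "keep in backlog"

    # Small refactoring: piggyback on feature work in the same area.
    if task.upcoming_work_in_area:
        return "add to definition of done"
    # No upcoming work: only worth doing if the code is a quality risk.
    if task.is_quality_risk:
        return "schedule"
    # Stable, rarely touched code is left alone.
    return "skip"
```

A small task on stable code with no upcoming work falls through to `"skip"`, which matches the motivation of avoiding refactoring that has no impact.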

Key takeaways

We need to constantly groom and refactor our code. Achieving a good balance of software maturity and fault tolerance requires constant adjustments.
Refactoring might require a small amount or a lot of effort. This post describes a mechanism to balance the refactoring work alongside other tasks targeting business requirements.
Not all code should be refactored. We should target refactoring where it improves quality and velocity. If the code is stable, fails rarely, and is not maintained often, the impact of refactoring it will be low. Let it decay in peace.