The art of the bodge or how to incur tech debt

We’re all familiar with strategic solutions, also known as the “temporary fix”, “technical debt” or “it’ll do for now”. These are the things we do that we’re not proud of. We know they’re not quite right, or that there’s probably something better, but they “solve” the immediate issue, “get the job done” and allow us to carry on with other work.

Our feelings when we implement these bodges range from an inkling that there’s probably a better way of doing it, to crossing our fingers and hoping that nobody notices before we get a proper solution in place.

I’ve been guilty of this on plenty of occasions. I’ve hardcoded variables, I’ve cobbled together scripts which include bash, Python and Perl in the same file, I’ve manually edited startup files, and I’ve written READMEs which include instructions like “run this script twice”, so I’m not judging anyone any more than I judge myself.

I first got the idea that this attitude could be a problem when I was auditing the code that held together the email service for a large local ISP. I saw two lines that alarmed me. I don’t remember the language, but the first was essentially:

IF 1 = 1 THEN

This was distinctly puzzling. Why would anyone include a condition that was always true? Perhaps there had been a failure case that was no longer relevant, or perhaps someone was still working on it.

Then just below it I saw the properly alarming line. It said “Temporary fix” with a date. This “temporary fix” had been in place for almost 6 years.

“Oh ffs,” I thought, and set about removing the test and just going straight to the function. I reloaded the service and, of course, it didn’t start.

I spent several days over the next couple of weeks trying to figure out why this code depended on a test that was always true. It was as if the test itself was more important than what it did as a result of that test. In the end I just added another line below “temporary fix”:

“Don’t bother, I’ve already tried” and a new date.

The basic problem with temporary fixes and strategic solutions is that they are fixes and solutions. To make things better, or to implement the proper solution, you need to make an effort. The best-case scenario is that you simply need to divert time away from something else to implement the new fix, but just as often you need to break the existing system to bring in the new one and invest further time to retest and verify.

The giraffe’s laryngeal nerve is a great example of evolutionary tech debt. It’s what happens when you start with one design in our fishy ancestors and gradually iterate until you end up with a giraffe. Then you notice that you’ve got this nerve that travels all the way down the neck and back up to bridge a gap between organs that are a few inches apart. It’s a terrible, terrible setup, but it’s utterly impossible for evolution to fix it.

The longer you leave a temporary fix in place, the harder it becomes to replace it with something else and the reason to do this becomes more and more obscure.

“Well, it’s been running that way for six months, so it must be fine”

No. If it was crap when it went in, it’s still crap. It’s just six-month-old crap now.

“Yeah, but it works”

Yes, but it could work more efficiently, more reliably, better, stronger, faster. If it’s done properly then it’ll be easier to upgrade and modify.

Bodges can make things difficult for people that come along later. First, you need to know that there’s a bodge there. If you didn’t follow the usual procedure, then the bodge may be somewhere that a later engineer wouldn’t spot (like a manually edited config file). Once the bodge has been spotted, it needs to be deciphered so that the engineer can understand what’s going on here. Then you need to go through the process of determining why it was bodged rather than done properly. Was it simply haste? Laziness? Ignorance? Or was it done deliberately because of unknown factor Z?

All this is extra work that will be incurred any time someone goes near it. It may come to the point where people are afraid to go near your Lovecraftian nightmare spaghetti code, and the module lingers and fossilises until none can remember its true purpose. Occasionally a brave young acolyte will feed it a variable and pray that an appropriate exit code is presented at the other end.

So how do we combat this?

Up-front tech debt prevention steps can include such things as prohibiting SSH access to your servers (so everything has to be done via config management and manual fixes are impossible), or using something like Jenkins Job Builder on your CI server (blocking config changes via the GUI and again enforcing config management). Open sourcing your code by putting it in a public git repo forces you to think about security up front, and requires that things like hostnames, IPs and passwords are sequestered away somewhere as variables and aren’t hard-coded anywhere.

These things are, undoubtedly, a pain in the arse, but after a short time of using them you get into the habit, and time invested early saves you having to come back later and try to unravel your mess.
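As a rough illustration of keeping hostnames and passwords out of the code, here is a minimal Python sketch that reads them from environment variables and refuses to start if any are missing. The variable names (MAIL_HOST, MAIL_PORT, MAIL_PASSWORD), the default port and the load_settings helper are all made up for this example; your config management tool will have its own way of supplying them.

```python
import os


class MissingSettingError(RuntimeError):
    """Raised when a required setting is absent from the environment."""


def load_settings(environ=None):
    """Read connection details from environment variables, not from the source."""
    env = os.environ if environ is None else environ
    # MAIL_HOST and MAIL_PASSWORD are hypothetical names chosen for this sketch.
    required = ("MAIL_HOST", "MAIL_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail loudly at startup rather than falling back to a hard-coded value.
        raise MissingSettingError("missing settings: " + ", ".join(missing))
    return {
        "host": env["MAIL_HOST"],
        "port": int(env.get("MAIL_PORT", "25")),  # assumed default for the sketch
        "password": env["MAIL_PASSWORD"],
    }
```

Calling load_settings() with no arguments reads the real environment. Passing a plain dict instead makes it easy to test without touching your shell. The point of raising an error is that nobody is tempted to “just put the password in for now”.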

There are two types of technical debt. There’s the kind that’s planned in advance, because it’s the quicker or cheaper route to functionality, or because it’s a viable workaround until a particular component or resource is available. And there’s surprise, hasty tech debt. Sometimes it’s due to a surprise vulnerability with a clickbait name, sometimes in the wee hours of the night a bleary-eyed engineer does the only thing they can think of to make the alarms stop, sometimes a release date gets brought forward, sometimes somebody further up the chain decrees you need to change your entire network configuration a month before go-live.

With the sort that’s known in advance, make it clear that you’re saving X points now but it’ll cost X points later. Get buy-in to the idea that this will come back around again, and stay on it.

View this sort of tech debt the same way as financial debt. With large projects it’s almost inevitable. It’s not necessarily a bad thing as long as you have a sound plan to manage it and actually put that plan into action. If you just leave it, like financial debt, it will grow and grow, possibly even putting the project at risk.

In theory, it’s possible to avoid it entirely, in the same way it’s possible to avoid financial debt. Burn your credit cards, pay for everything in cash, don’t buy anything you can’t afford. Using this approach can be massively inconvenient and can lead to missing deadlines (in the case of tech debt) or spending a week eating nothing but beans (in the case of financial debt).

The main thing that’s needed to correct this problem is understanding. Understanding from developers and engineers that this needs fixed and understanding from the clients that the reason they’re not getting shiny new feature X this week is because you’re busy sorting out the mess that was created in order to get a product shipped for their previous arbitrary deadline.

With the hasty sort, you just have to clean it up as quickly as possible afterwards. Look on it as if someone has been sick on your sofa. Technically you could just leave it there and carry on watching EastEnders, but would you?

This sort of push-to-production bodge or the 3am fix needs to be addressed as soon as the flames die down. Preferably by the engineer who implemented it paired with another with fresh eyes who is able to see the issue as just another problem to be solved and not a screaming pile of flaming roadblock that kept you from your bed for 5 hours. Often when you’re forced to implement something like this, then all you can see is the gaffer tape you slapped all over the place and not the proper way of doing it.

IT happens

Tech debt and strategic solutions happen, and when they do happen, whatever the reason, you need to track them, and track them somewhere visible. Whatever system you’re using, be it Trello, Jira or post-its on the wall, stick it up there as soon as it happens.
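If your bodges are at least marked with a dated comment, like the six-year-old “Temporary fix” above, you can dig them out automatically and put them on the board. The following is only a sketch: it assumes a comment convention of “Temporary fix” followed by a date in YYYY-MM-DD form, which is my invention for the example, and find_bodges is a hypothetical helper rather than part of any real tool.

```python
import re
from datetime import date

# Assumed comment convention for this sketch: "Temporary fix YYYY-MM-DD".
MARKER = re.compile(r"temporary fix\D*(\d{4})-(\d{2})-(\d{2})", re.IGNORECASE)


def find_bodges(lines, today):
    """Return (line_number, age_in_days) for each dated marker, oldest first."""
    found = []
    for number, line in enumerate(lines, start=1):
        match = MARKER.search(line)
        if match:
            year, month, day = (int(part) for part in match.groups())
            age = (today - date(year, month, day)).days
            found.append((number, age))
    # Oldest bodges first, since those are the ones most likely to have fossilised.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

You would feed it the lines of a source file and date.today(), then raise a ticket for anything it finds. It won’t catch the undated or undocumented bodges, which is rather the point of writing the date down in the first place.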

Track it like Steve Irwin

Which brings us to the only real solution to tech debt and strategic solutionising: you’ve got to own it, wear it like a brand, carry it like an albatross around your neck until you are rid of it. At every opportunity, every planning meeting, bring it up. Crawl on your knees and present your tickets of shame. Say “I have committed an atrocity in code, please may I humbly request a story point to clear my name of this chagrin so that I may once again hold my head high and look myself in the mirror. Please, I beg of you, grant me this boon so that my children do not live with the same dishonour as I.”
Eventually, you’ll get the points, the time or the help you need to fix it properly, or you’ll get told that it’s not going to get fixed, in which case get that person to comment and close the appropriate ticket.

// placeholder conclusion, proper conclusion to come in v2 of this article

One more note about strategic solutions. A couple of centuries ago, there was a newly independent country, and the two houses of their parliament debated what to call their new leader. One side wanted a grand title like “king” or “lord high protector”. The other side didn’t want him getting delusions of grandeur and wanted to give him a ridiculously humble title, “like you might give the head of a cricket club”. This was important. Instead of figuring out how to run the country, they spent 3 weeks debating this title.
Eventually, the side that wanted “king” compromised. They said, “Ok, we’ll use your silly title for now, just so we can get on with business. But we want it made absolutely clear that we don’t agree with this and this is a temporary measure.”
To this day, the Senate of the United States of America has still not officially approved the title “president”.