The Law of Software Bugs

Generally, it is easier to make a mistake in program code than not to.

Hovik Melikyan
7 min read · Dec 17, 2015

This may or may not sound intuitive, but I think there is a semi-formal proof for it. And some interesting corollaries in the same spirit, too.

So why are mistakes harder to fix than to make in the first place, specifically in software engineering?

Let’s try to look at it from the point of view of entropy, chaos and order.

This is order:

And this is chaos:

One thing that we all know about chaos and order in general, though we usually don’t bother to think of explanations, is that it is easier to create chaos than to bring order into a system. A dog can create a mess in your bedroom pretty quickly, while putting everything back in its place requires time and some intellectual effort. Or a more bitter example: it may take a whole construction company years to build something, while it takes one man with enough explosives to bring the structure down in an instant.

The reason why it’s harder to create than to destroy is that there is a greater number of possible chaotic states in a system than there are ordered ones. Of course it depends on your definitions and thresholds for order, but any reasonable definition will leave you with far fewer possible ordered states than chaotic ones. Think of your socks in the bedroom: to create a mess you can just throw them in arbitrary directions with arbitrary anger, without much thinking. A dog can do it. But sorting the pairs and arranging them in a predictable way, so that they can easily be found when needed, obviously requires greater effort, plus some cognition and thinking.

So it’s all about the number of possible states (chaos) and a subset thereof that makes most sense (order).

What about mistakes in program code then? What are bugs anyway?

You have a bug when your code doesn’t work as intended. (Let’s save the topic of computability for another blog post, as it’s a complex one. A task may or may not have a solution in reasonable time, if at all, but by intention I mean something a priori doable. Let’s just simplify things for now: the program doesn’t work as intended.)

For example, you need a function that returns an integer average of two ints. At first, you will likely write it down as:

int avg(int a, int b)
{
    return (a + b) / 2;
}

Right?

Wrong. It does not take into account a possible integer overflow, whereas an average of two ints should always be between a and b. In other words, the function avg() is expected to always return a (rounded down) result, and to never overflow.
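
To make the problem tangible, here’s a minimal sketch (the test values are my own). Keep in mind that signed overflow is undefined behaviour in C, so the standard promises nothing; on a typical two’s-complement machine, though, the sum simply wraps around and the “average” comes out negative:

#include <limits.h>
#include <stdio.h>

int avg(int a, int b)
{
    return (a + b) / 2; /* a + b overflows when both arguments are large */
}

int main(void)
{
    /* The exact average is INT_MAX, but on most machines this prints -1. */
    printf("%d\n", avg(INT_MAX, INT_MAX));
    return 0;
}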

Let’s now fix it. A novice programmer is likely to “fix” it the following way:

int avg(int a, int b)
{
    return a / 2 + b / 2;
}

which, of course, is wrong again.
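
It never overflows, true, but now each division throws away its own remainder, so the result can easily be off by one. A quick sketch (the function name here is mine, just to keep the versions apart):

#include <stdio.h>

int avg_novice(int a, int b)
{
    return a / 2 + b / 2; /* each division drops its remainder */
}

int main(void)
{
    printf("%d %d\n", avg_novice(3, 3), avg_novice(1, 1)); /* prints "2 0" instead of "3 1" */
    return 0;
}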

On one Internet forum, when people were asked to fix avg() in its original form, there were also suggestions to cast the arguments to floating point and then round the result back to int: again, a bad idea that can yield strange results, or will be inefficient at best. An average of two ints computed using floating point, really?
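
For what it’s worth, here is what the “strange results” part can look like, assuming single-precision float is used (with double, every 32-bit int is representable exactly, so the result would be correct, just needlessly slow). A float has only 24 bits of mantissa, so large ints get silently rounded before the division even happens:

#include <stdio.h>

/* A hypothetical float-based version, for illustration only. */
int avg_float(int a, int b)
{
    return (int)(((float)a + (float)b) / 2.0f);
}

int main(void)
{
    /* 16777217 == 2^24 + 1 is not representable as a float and gets
       rounded to 16777216 before we ever get to divide. */
    printf("%d\n", avg_float(16777217, 16777217)); /* prints 16777216 */
    return 0;
}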

Let’s think of a better way then. The next obvious fix could be this:

int avg(int a, int b)
{
    return a + (b - a) / 2;
}

Right?

Nope! Try avg(INT_MIN, INT_MAX) and you will see how wrong it is. In fact, whenever a and b have opposite signs, (b - a) can overflow quite easily, long before the extremes INT_MIN and INT_MAX are involved: with a = INT_MIN and b = INT_MAX the difference would be INT_MAX - INT_MIN, which is far outside the range of an int.

OK, the final solution? I would like to believe it’s this, unless I’m missing something, of course:

int avg(int a, int b)
{
    if ((a < 0) != (b < 0))
        return (a + b) / 2;
    else
        return a + (b - a) / 2;
}
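
If you want a bit more confidence in this version, a quick check is to throw at it the edge cases that broke the earlier attempts (a minimal test sketch; the expected values follow C’s truncating integer division and the rounding this particular formulation happens to produce):

#include <assert.h>
#include <limits.h>

int avg(int a, int b) /* the version above, repeated so the snippet compiles on its own */
{
    if ((a < 0) != (b < 0))
        return (a + b) / 2;
    else
        return a + (b - a) / 2;
}

int main(void)
{
    assert(avg(INT_MAX, INT_MAX) == INT_MAX);
    assert(avg(INT_MIN, INT_MIN) == INT_MIN);
    assert(avg(INT_MIN, INT_MAX) == 0);     /* the exact average is -0.5 */
    assert(avg(0, INT_MAX) == INT_MAX / 2);
    assert(avg(3, 4) == 3);
    return 0;
}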

First of all, one interesting thing that should be apparent from the above is that a software bug is usually the result of not considering, disregarding or forgetting something, or of not knowing it at all.

There is a certain number of objects with certain properties at play on each line of source code. In the case of avg(), for example, we have: a, b, 2, +, /, and the result. To spot the bug and fix it before it brings down your multibillion-dollar cloud service or a spaceship, you should look at all the objects involved in an expression or statement and think of everything you know about them. As in: a and b are ints and so are doomed to always stay within certain limits (too sad, but that’s life); the + operation is notorious for overflowing without warning in most languages; division tends to screw ints up by truncating them and also hates a zero divisor, though thankfully we’re fine with that one at least; and so on.

In real life it is all too easy to forget, disregard or simply not know any of the above. There exist only a very few formally correct solutions for a given task, usually one of them being the shortest, but there are far too many ways to slip into a bug. If you need to keep 10 facts in mind while coding a solution, disregarding any one of them will result in a bug. And our brain is not ideal: it tends to forget things when it shouldn’t.

So, our quest to find the best solution for avg() should have already reminded you of the mischievous dog in your bedroom. It is all too easy to make a mistake, and comparatively hard to find the right solution.

To take a slightly more formal approach, let’s say you are given an a priori computable task and you need to decide how much effort to put into coding it. There are two edge cases here: zero effort and infinite effort.

With zero effort, provided that you have to produce something anyway, you would just throw a random binary sequence at the machine and see if it works. This is pure chaos: the probability that a random sequence will solve a particular problem is negligible. Pretty much like throwing a bunch of balls onto a pool table in the hope that they will form a pyramid. There are too many possible outcomes for the correct one to happen by chance.

With infinite effort, or at least with enough effort to find the shortest and sweetest solution, you will of course find it. This is order. It’s hard.

No, it is always harder.

As you move away from zero effort and take more and more things into account, there will be fewer and fewer possibilities left; many of them will still be wrong or buggy, and only very few will be correct. Thus, moving in that direction increases your chances of writing a correct solution.

So, to sum it up:

Finding a correct solution to a task involves keeping in mind and applying a certain number of facts. Ignoring or not knowing any of them will result in a bug. Introducing a bug is always easier than having and applying all the relevant knowledge, because there are far more states in which some relevant piece of knowledge goes unapplied than there are states in which everything is taken into account.

Point proven? I’d say, yes.

A few more thoughts. Firstly, fixing a software bug is not the same as finding the right solution: there is extra time required for identifying the offending piece of code in the first place, and only then thinking of and finding a better solution.

But what makes it even worse is the fact that finding the correct solution takes greater effort than coding the buggy one (The Law of Software Bugs), and therefore the damage done to the project by a bug equals the time spent coding the wrong solution, plus the time spent finding and identifying the mistake later, plus any moral or financial damage suffered by the team or company. Thus:

Corollary 1. Fixing a bug requires an (often disproportionately) greater effort than not introducing it in the first place.

One crucial and often overlooked side effect of bugs is the degradation of the overall competence of new hires over time, which in turn causes bug rates to rise further, and so on. Once a product is past its initial prototyping stage, usually done by a tight team of competent coders, there comes a stage of relatively less interesting work and, of course, bug fixing. This is when a team starts slipping into the spiral of cost cutting and declining quality. New hires tend to be less competent (because who else wants to do the boring stuff and fix bugs?) and consequently introduce more bugs, and so on. Over time, unless the management takes some extraordinary measures, which it never does, an average software team reaches an equilibrium in which 50% of its time is spent on bug fixing (as shown by this research, though it doesn’t seem to cover the initial prototyping stage when, anecdotally, fewer bugs are introduced), and the more competent early employees or founders leave or otherwise quit coding. All of this is essentially the result of letting the first bugs, or a bad initial design, happen.

Corollary 2. A less competent team member can cause disproportionately greater damage to the team’s work relative to that member’s lack of competence and lower pay.

And finally, just like a physical system slips into chaos when left “unattended”, a software project slips into the mess that is your typical software sweatshop: a bloated team, an unmaintainable and buggy codebase, financial losses and no joy at all. Chaos that is easier to achieve than to avoid. Which brings us to:

Corollary 3. Only a conscious collective effort to minimize bugs at all stages of a product’s lifetime can preserve the enjoyment of coding and cut development costs. Therefore, any programming “methodology” that cannot, in the end, demonstrate reduced bug rates has no practical value whatsoever.

(Corollary 3.1. Pssst. If you can code alone, code alone. Just make sure no bugs or “boring” work sneaks into your project so that there’s no need to hire more people.)
