Why does testing not find bugs?

Even after extensive testing, it turns out that your software is still riddled with bugs. How does that happen?

Since this subject comes up all the time, I figured I should write some kind of repeatable point of reference on it (including semi-fancy graphics).

Typical Example of Bug Spread

In this image you see my example of a typical bug spread as a function of time. As you can probably guess, the ‘E’s in the image are errors. The horizontal scale is an arbitrary amount of time, but let’s think of it in terms of hours.

It doesn't matter how long you test. Once you've found all the obvious bugs that have to do with the basic functioning of your code, the bugs displayed above will still remain. They are caused by all kinds of weird edge cases.

It’s possible your software is so simple that edge cases do not exist (in which case, good for you), but any sufficiently complex piece of software will have to deal with them.

Now, after a day of testing, your diagram might look like this.

No bugs found, so all is good, right?

Here’s the diagram after we test it out on our first customer.

Still no bugs, so this should be safe to roll out to production, right?

Suddenly all these day bars are being checked at the same time (one per person using the product). Within 10 hours, 6 previously unknown bugs appear.

Holy hell?! Where did all these errors come from? Didn't you test this? We even did the test with the real customer, and everything went fine! How do you explain that we have 6 errors on the first day?

Basically, that’s just the nature of errors caused by edge cases. We could have tested for 29 days and might have found them all, but that’s hardly the most productive use of time.
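To make that a bit more concrete, here is a minimal Python sketch of the diagrams. It rests on my own simplifying assumptions: the grid has 29 day bars of 10 hours each, the 6 bugs sit in random cells, and checking a bar means one person running through that bar's hours. The names `place_bugs` and `bugs_found` are made up for this illustration.

```python
import random

DAYS = 29            # one row per "day bar" (assumed from the 29 days above)
HOURS_PER_DAY = 10   # assumed width of each bar
BUG_COUNT = 6        # the production example above surfaces 6 bugs


def place_bugs(seed=0):
    """Scatter BUG_COUNT edge-case bugs over distinct (day, hour) cells."""
    rng = random.Random(seed)
    cells = [(day, hour) for day in range(DAYS) for hour in range(HOURS_PER_DAY)]
    return set(rng.sample(cells, BUG_COUNT))


def bugs_found(bugs, bars_checked, hours=HOURS_PER_DAY):
    """Return the bugs hit when each of the given bars is exercised for `hours`."""
    bars = set(bars_checked)
    return {(day, hour) for (day, hour) in bugs if day in bars and hour < hours}


bugs = place_bugs()
one_tester = bugs_found(bugs, bars_checked=[0])          # one day of in-house testing
with_customer = bugs_found(bugs, bars_checked=[0, 1])    # plus the first customer
production = bugs_found(bugs, bars_checked=range(DAYS))  # one bar per user, in parallel

print(len(one_tester), len(with_customer), len(production))
```

Whatever the seed, the production run covers every bar and so hits all 6 bugs within the same 10 hours, while a single tester only ever sees the bugs that happen to sit in their own bar. Covering the grid alone would take one tester all 29 days.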

These bugs are also of a kind where you generally do not think of the conditions that cause them, especially if you are the one who wrote the code. So you might never even follow the code paths that would expose them.

This is why good software companies have dedicated testers. They might actually stand some chance of finding all of these.

The rest of us will just have to deal with the fact that we can’t fix everything before release, and be happy that the scope of these issues is generally very limited.

P.S.: Well-designed software may decrease the likelihood of edge bugs. Unfortunately, most software is not well designed.

Tests may also decrease the likelihood, but they are generally designed to test the expected flow, not edge cases (regression tests will help, but only after the issues have been found in the first place).
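As a small illustration of that difference, here is a sketch with a hypothetical `average` function (not taken from any real project). The first test is the expected-flow kind most of us write up front. The second is a regression test, the kind that usually only gets written after someone has already run into the edge case.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    if not values:
        # Guard added after the edge case was reported; before that,
        # an empty list crashed with a ZeroDivisionError.
        raise ValueError("average() needs at least one value")
    return sum(values) / len(values)


def test_average_expected_flow():
    # Typical input, typical result.
    assert average([2, 4, 6]) == 4


def test_average_empty_list_regression():
    # Pins down the fixed behaviour so the bug cannot quietly return.
    try:
        average([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for an empty list")


test_average_expected_flow()
test_average_empty_list_regression()
```

The expected-flow test would have passed both before and after the fix, which is exactly why it never caught the problem.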

The diagrams are a simplification: there are many ways you could do the testing, and they don't account for differences between individual testers. They still work as a general indication.
