Test Until Fear Turns to Boredom
As software engineers, we are always deciding whether the testing we have done is sufficient to release what we are developing. Some teams require explicit sign-offs before a release, such as a minimum level of test coverage or a certain number of code review approvals, whereas others leave the rigor of testing entirely up to the developer. Either way, the decision on whether our work is sufficiently tested is often left up to us to some extent. In these cases we ask ourselves, “How much testing is enough?”
Kent Beck is widely credited with creating “Test-Driven Development,” a process in which developers first write automated tests that fail, and keep failing, until the task they set out to accomplish has been completed. When asked for advice on when testing is sufficient, he declared, “Test until fear turns to boredom.”
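To make the test-first idea concrete, here is a minimal sketch. The `slugify` function and its expected behavior are hypothetical, invented purely for illustration. In a test-driven workflow the two test functions would be written first and would fail until `slugify` is implemented.

```python
# Hypothetical example: `slugify` and its expected behavior are
# assumptions made up for this sketch, not taken from any real project.


def slugify(title: str) -> str:
    """Lowercase the title and join its words with hyphens."""
    # In a test-first workflow this body is written last, after the
    # tests below already exist and have been seen to fail.
    return "-".join(title.lower().split())


def test_lowercases_and_joins_words():
    assert slugify("Fear Turns To Boredom") == "fear-turns-to-boredom"


def test_collapses_extra_whitespace():
    assert slugify("  Test   Until  ") == "test-until"
```

The tests are plain functions with bare assertions, so a runner such as pytest can collect them, or they can simply be called directly.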
How can you tell when you have reached that point? There are several useful benchmarks. Let’s consider the different levels of testing. In general, we have unit testing (the lowest level, generally run at build time), integration testing (broadly, tests of several units working together, sometimes automated), and manual testing (often performed or validated by a quality engineer).
So how many tests should be written?
The short answer is, “It depends.” An entertaining story, posted as an answer to a StackOverflow.com question, describes the different responses this question could receive depending on the skill level of the developer who is asking. The most experienced programmer is asked in response, “How many grains of rice should I put in [a boiling pot]?” The programmer replies, “How can I possibly tell you? It depends on how many people you need to feed, how hungry they are, what other food you are serving, how much rice you have available, and so on.” And so it is for most features. A public-facing API will likely call for near 100% unit test coverage, whereas a smaller feature may need less. Only you will know how many tests are sufficient for what you are writing, and enumerating them becomes easier with time.
At the integration/automation test level, you will generally want enough tests to validate each potential user flow: one for the happy path and one for each of the major error pathways. Not all integration tests can be automated, but if yours can, you will at least want to automate testing of your whole flow end to end. As for testing external systems, the general best practice is to test only what is within your control. If your tests might be blocked by external dependencies, it can help to gate them behind a ping test (so they don’t run unless the external dependency is up).
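One way to sketch such a ping gate, using only Python’s standard library: check whether a TCP connection to the dependency succeeds, and skip the whole test class if it does not. The host name, port, and test class here are hypothetical placeholders, and a plain TCP connect is just one possible definition of “up”; a real service might call for a health-check request instead.

```python
import socket
import unittest

# Hypothetical address of the external dependency (an assumption for
# this sketch; substitute whatever your tests actually rely on).
DEPENDENCY_HOST = "payments.example.internal"
DEPENDENCY_PORT = 443


def dependency_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


class CheckoutFlowTest(unittest.TestCase):
    """Hypothetical end-to-end tests that need the external service."""

    @classmethod
    def setUpClass(cls):
        # The "ping test": skip every test in this class when the
        # dependency cannot be reached, instead of reporting failures.
        if not dependency_is_up(DEPENDENCY_HOST, DEPENDENCY_PORT):
            raise unittest.SkipTest("external dependency is unreachable")

    def test_happy_path(self):
        # Placeholder: the real end-to-end flow would be driven here.
        pass
```

Run with `python -m unittest`, the class is reported as skipped rather than failed when the dependency is down, which keeps an outage outside your control from blocking your own build.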
At the manual testing level, it is useful to consider the different types of possible failures. You’ll be making an implicit trade-off between the boredom you will experience as you steadfastly enumerate every type of failure and write a test for each one, and the fear you will experience if you’re not quite sure that your changes to the code base won’t break things. Try to find a reasonable balance between the two. You can recalibrate based on how often you find yourself failing. If you’re spending a lot of time on this stage and never experiencing any problems, you might be focusing too much on writing tests. On the other hand, if you’re playing it fast and loose, you will likely fall victim to patch deployments, snarky comments from your QA, and lots of bugs filed against your tickets. Time to up your tests!
Some teams add a further layer of acceptance testing to validate that the product is “acceptable” for end users or that it meets the agreed specifications. Acceptance tests are generally run by having a small group of users (or employees, via an internal release) test the product end to end and report any issues.
In conclusion, pay attention to your gut fear as a signal that more tests are necessary. Consider reevaluating your team’s approach to testing if you are constantly fighting fires from features that were not well validated. And may your tests be thorough enough to eliminate your fear without boring you to death!
Follow me on Twitter @amandasopkin.