Coverage Is Nothing Without Assertions
How analyzing code coverage results unearths deceptive metrics
A few weeks ago I implemented my first UI test with XCTest for an iOS mobile app as part of a personal project. Besides finding the test pretty easy to implement, I was impressed with the out-of-the-box coverage metrics that Apple’s framework provides. The simulation of a simple customer journey covered approximately 50% of the app’s code. At first, this gave me great satisfaction and seemed to confirm the assumption that a system or acceptance test implicitly leads to great coverage. But it also got me thinking. Had I really verified that the ~50% of the code that was covered works as expected, or had I just executed it without knowing whether it worked correctly?
Let’s look at an example. The app under test lets users add entries to a cinema diary, including the movie and venue. The venue can be chosen from suggested theaters nearby, based on the user’s location. Movies currently showing are also suggested to make adding new entries easier. The previously mentioned test adds such an entry, so the methods providing the cinemas nearby and the movies currently shown are used and thus covered. But had I verified that the displayed venue and movie were actually correct? Was the venue the nearest based on the current location? Was the movie actually being shown in this area? With my test, I had simply implemented a typical customer journey and not yet included any kind of verification concerning the venue or the movie. But the code was executed, so it counted as covered, and the metrics reported it accordingly.
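To make this concrete, here is a minimal sketch in Python rather than the app’s actual Swift code. All names in it (`Venue`, `nearest_venue`, `add_diary_entry`, the sample theaters) are made up for illustration, and the bug is planted on purpose. It shows a journey that executes the venue lookup without checking its result, next to one that verifies it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Venue:
    name: str
    distance_km: float


def nearest_venue(venues):
    # Deliberate bug for illustration: max() picks the FARTHEST venue.
    return max(venues, key=lambda v: v.distance_km)


def add_diary_entry(venues, movie):
    # Hypothetical stand-in for the "add entry" step of the journey.
    return {"venue": nearest_venue(venues).name, "movie": movie}


VENUES = [Venue("Roxy", 0.4), Venue("Odeon", 7.5)]


def journey_without_assertions():
    # Executes (and therefore "covers") nearest_venue, but checks nothing.
    add_diary_entry(VENUES, "Some Movie")


def journey_with_assertion():
    entry = add_diary_entry(VENUES, "Some Movie")
    # The verification that turns coverage into evidence.
    assert entry["venue"] == "Roxy", f"expected Roxy, got {entry['venue']}"


if __name__ == "__main__":
    journey_without_assertions()  # completes silently despite the bug
    try:
        journey_with_assertion()
    except AssertionError as err:
        print("caught:", err)
```

Both journeys execute exactly the same lines, so a coverage report would look identical for them. Only the second one notices that the wrong theater was suggested.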
Code coverage in regard to test levels
Let’s dig a little deeper. When code coverage is discussed, it is usually in regard to unit tests. For example, a simple method that returns whether a given number is odd or even can easily be tested thoroughly with a handful of tests, covering each regular case (odd, even) and some edge cases (zero, a negative number, a number higher than the maximum possible value of the type used). When these tests are executed and the coverage is analyzed, it should be ~100%, whichever coverage criterion is measured (path, line, state, etc.).
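Such a test suite could look roughly like the following sketch, written in Python with the standard `unittest` module. The function name `is_even` and the chosen values are my own assumptions, not taken from a real project. Python integers have no maximum value, so the last test only approximates the overflow edge case that a fixed-width type would have.

```python
import unittest


def is_even(number: int) -> bool:
    """Return True if number is even, False if it is odd."""
    return number % 2 == 0


class IsEvenTests(unittest.TestCase):
    def test_even_number(self):
        self.assertTrue(is_even(4))

    def test_odd_number(self):
        self.assertFalse(is_even(7))

    def test_zero_is_even(self):
        self.assertTrue(is_even(0))

    def test_negative_numbers(self):
        self.assertTrue(is_even(-2))
        self.assertFalse(is_even(-3))

    def test_very_large_number(self):
        # Python integers are unbounded, so this stands in for the
        # "beyond the maximum value" edge case of fixed-width types.
        self.assertFalse(is_even(2**63 + 1))


if __name__ == "__main__":
    unittest.main()
```

Every line of `is_even` is executed by these tests, and every execution is paired with an assertion on the result.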
Wait a minute. Did you notice something? The implemented unit tests are white-box tests, which are based on knowledge of the underlying code. So the coverage analysis is equally transparent and comprehensible. But does this hold for higher test levels, such as integration, system or acceptance tests? My answer to this question is “no”. Where an integration test is based on the defined interface, a system test is based on the functional description and an acceptance test is based on acceptance criteria, black-box testing is applied. Thus, a direct correlation between test and code is neither transparent nor comprehensible. This explains my initial misdirected satisfaction. But it also tests (no pun intended) our common sense. What value do coverage metrics actually have on these higher test levels, if any at all?
Adding value to acceptance tests by using multiple assertions
Enhancing acceptance tests with multiple verifications adds value to the tests with no negative impact. It will not make them slower, nor will it make them more brittle. But keep in mind that the added value only holds if this is your only test level. Obviously, each separate functionality should be covered by a test (or multiple tests) on a lower level.
Nevertheless, by achieving high code coverage through acceptance tests, you can guarantee an undisturbed customer journey, assuming that the test would fail in the event of the app crashing. Thus, coverage metrics help ensure that the majority of functionalities available to the customer are actually exercised in the acceptance test. Combining these tests with alerting on the logs also detects any uncaught exceptions that might appear during the app’s regular usage.
Coverage metrics are useful on all levels of testing, although their explanatory power varies. While coverage metrics at the unit test level give insight into gaps in the test coverage, coverage metrics at the acceptance test level merely indicate the percentage of code “involved”. A bug in code covered by a unit test will cause that test to fail, but it would not necessarily fail an acceptance test covering the functionality that the code provides.
That said, when using coverage metrics, make sure you understand how they work and what insights you gain from them before using them as a foundation for your project or implementing them as quality gates.