Who Will Test The Tests?

TL;DR — Mutation testing can offer an extra level of insight into code covered by tests by spotting gaps in your assertions.

The example app shown in this post is available on GitHub.

Confidence

Automated tests can bring many benefits, one of which is increased confidence that a system does what it should. But once this confidence is gained, you may find yourself troubled by this question:

Do I have confidence in my tests?

Or more specifically, do my tests include the right set of scenarios to cover the various code paths within my system? Is it possible to test my tests?

Test Coverage

At this point people will often recommend using test coverage tools. These tools record which parts of a system are executed whilst your tests are run. The idea is to highlight areas of your code which are without tests, where you can then consider adding some.

Whilst test coverage is a useful technique, it’s not a panacea. One trap that’s easy to fall into is thinking that covered code is tested code. That may sound counterintuitive, so here’s an example.

An Example

Say we’re making an online form that allows people to sign up for an account with a website. We need to validate that the password the user provides matches some criteria. For simplicity’s sake, let’s say it needs to be between 1 and 10 characters (this is very silly, of course, it’s just to keep the example simple). Here’s an implementation in JavaScript:

const passwordValidation = {
  // A password is valid when it is between 1 and 10 characters long
  validate: function(password) {
    if (password.length > 0 && password.length < 11) {
      return true;
    }

    return false;
  }
};

And here are the tests, written using Jasmine:

// Node's built-in assert module provides assert.equal
const assert = require('assert');

describe('Password validation', () => {
  it('An empty string is not a valid password', () => {
    assert.equal(
      passwordValidation.validate(""),
      false);
  });

  it('A non-empty string is a valid password', () => {
    assert.equal(
      passwordValidation.validate("A"),
      true);
  });
});

Some of you may have already noticed the deliberate mistake: we have no test case for a password that is too long. Now you could say that by always doing Test Driven Development, using Red/Green/Refactor and so on, you should never end up in this situation. And you’d have a point. But even with the best of intentions, people mess up from time to time.

But that’s ok, because a test coverage tool will tell us about things like this. Right? Let’s run the code through a test coverage tool (I used Istanbul).

------------------------|---------|----------|---------|---------|
File                    | % Stmts | % Branch | % Funcs | % Lines |
------------------------|---------|----------|---------|---------|
All files               |     100 |      100 |     100 |     100 |
 password-validation.js |     100 |      100 |     100 |     100 |
------------------------|---------|----------|---------|---------|

Totally covered. So that’ll be a “nope” then! Or at least, not in every case.

But how did test coverage not pick up on this? Let’s look at the code again, specifically this if statement which consists of two expressions:

if (password.length > 0 && password.length < 11) {
  //...
} 

password.length > 0 is the first expression and password.length < 11 is the second.

The test coverage tool will consider the whole if statement to be covered if both expressions are run. But we can make this happen in one test case, simply by having a password whose length is greater than 0. In this situation the first expression is true, so the && cannot short-circuit and the second expression has to be run too. And voilà, the whole if statement is covered.

And because we have such a test case:

assert.equal(
  passwordValidation.validate("A"),
  true);

…then the whole if statement is indeed covered.
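To make that concrete, here is a minimal sketch of the idea using hand-rolled instrumentation. The hits object and the track helper are my own inventions for illustration, and this is not how Istanbul works internally. It simply counts how often each expression is evaluated.

```javascript
// Hand-rolled instrumentation (hypothetical helper names, illustration only).
const hits = { first: 0, second: 0 };

// Records that an expression was evaluated, then passes its value through.
function track(name, value) {
  hits[name] += 1;
  return value;
}

function validate(password) {
  if (track('first', password.length > 0) && track('second', password.length < 11)) {
    return true;
  }
  return false;
}

validate('');       // the && short-circuits, so only the first expression runs
console.log(hits);  // { first: 1, second: 0 }

validate('A');      // the first expression is true, so the second runs as well
console.log(hits);  // { first: 2, second: 1 }
```

Both expressions have now been evaluated at least once, yet nothing here checked what validate returned.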

Following this logic, it becomes apparent that to get 100% test coverage, you just need to exercise the code. No need for any assertions. Whilst it’d be pretty obvious if you had no assertions at all, there is a range of mistakes you can make writing assertions that won’t be picked up by test coverage, such as:

  • The wrong value is used for the “expected” or “actual” parameter.
  • An assertion is temporarily commented out, but never put back in.
  • The logic in the assertion is wrong (more likely if it’s complicated).

Test coverage doesn’t tell you anything about your assertions, just which code was executed by the tests.
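Here is a small sketch of that point. The runTest function is a stand-in for a real test runner, and the broken validator is a hypothetical bug I made up. A test that only exercises the code passes against both versions, whereas the same scenarios with assertions catch the bug.

```javascript
// Tiny stand-in for a test runner: a test passes unless its body throws.
function runTest(body) {
  try {
    body();
    return 'pass';
  } catch (error) {
    return 'fail';
  }
}

const correct = (password) => password.length > 0 && password.length < 11;
const broken = () => true; // hypothetical buggy version: accepts everything

// A "test" that exercises the code but asserts nothing.
const exerciseOnly = (validate) => () => {
  validate('');
  validate('A');
};

// The same scenarios, this time with assertions.
const withAssertions = (validate) => () => {
  if (validate('') !== false) throw new Error('empty password was accepted');
  if (validate('A') !== true) throw new Error('"A" was rejected');
};

console.log(runTest(exerciseOnly(correct)));   // pass
console.log(runTest(exerciseOnly(broken)));    // pass (the bug goes unnoticed)
console.log(runTest(withAssertions(correct))); // pass
console.log(runTest(withAssertions(broken)));  // fail
```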

What Now?

To be clear, I’m not saying that test coverage tools are flawed. Far from it. I’m saying that it’s important to know what they are designed to do and what their limitations are.

But that said, what’s a conscientious test author to do? Well, you could apply this principle:

If I modify the behaviour of the system, it should make a test fail.

For example, say I went into the code and deliberately tried to introduce a bug (I could invert an if statement, or maybe replace a ‘-’ with a ‘+’, etc.). Then when I come to run the tests, I would hope to see at least one failure. If all my tests pass, it means the behaviour I changed isn’t tested.
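Done by hand, that might look something like the sketch below. The two bugs (invertedIf and lowerBoundMoved) are hypothetical examples of my own, and testsPass is a condensed stand-in for the two Jasmine tests from earlier.

```javascript
// The two test cases from earlier, condensed: returns true when both pass.
function testsPass(validate) {
  return validate('') === false && validate('A') === true;
}

const original = (password) => password.length > 0 && password.length < 11;

// Hand-made bugs (hypothetical examples).
const invertedIf = (password) => !(password.length > 0 && password.length < 11);
const lowerBoundMoved = (password) => password.length > 1 && password.length < 11;

console.log(testsPass(original));        // true
console.log(testsPass(invertedIf));      // false, the tests notice this bug
console.log(testsPass(lowerBoundMoved)); // false, and this one too
```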

Now that’s fine in a toy example like I mentioned earlier, but it would soon get tedious for a code-base even slightly larger. Wouldn’t it be nice if there was a way of automating that process? Well, handily enough, there is, and it’s called Mutation Testing.

Mutation Testing

A mutation testing tool will take the portion of your code-base that’s covered according to a test coverage tool. It will then methodically change (“mutate”) each part of this code, running the relevant tests afterwards each time.

In each of these test runs, if a change (“mutation”) is not detected by the tests, and they all pass, then the mutant is said to have survived. If at least one test fails, the mutant is said to have been killed.
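The loop itself can be sketched in a few lines. This is a toy under some big assumptions: plain text substitution stands in for the syntax-tree based mutators that real tools use, the list of operators is made up, and testsPass is a condensed stand-in for the two tests from earlier.

```javascript
// Toy mutation tester (illustration only, not how real tools are built).
const source = 'return password.length > 0 && password.length < 11;';

// Hypothetical mutation operators: [text to find, replacement].
const operators = [
  ['> 0', '>= 0'],
  ['> 0', '< 0'],
  ['&&', '||'],
  ['< 11', '<= 11'],
  ['< 11', '> 11'],
];

// The two test cases from earlier, condensed: returns true when both pass.
function testsPass(validate) {
  return validate('') === false && validate('A') === true;
}

// Applies each operator to the source in turn, builds a function from the
// mutated text, and runs the tests against it.
function mutationTest(source, operators) {
  return operators.map(([find, replacement]) => {
    const mutated = source.replace(find, replacement);
    const validate = new Function('password', mutated);
    return { mutated, status: testsPass(validate) ? 'survived' : 'killed' };
  });
}

for (const { mutated, status } of mutationTest(source, operators)) {
  console.log(`${status}: ${mutated}`);
}
```

Running it prints one line per mutant, each marked as killed or survived.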

If we run our previous example through a mutation testing tool (for this example I used Stryker), our missing test case is exposed because a mutant survived. When the tool changed a “<” to a “<=”, the tests still passed:

Mutant survived!
/Users/oliwennell/mutation-testing-example/src/password-validation.js: line 5:35
Mutator: BinaryOperator
- if (password.length > 0 && password.length < 11) {
+ if (password.length > 0 && password.length <= 11) {

And the tool gives a nice summary:

9 total mutants.
1 mutants survived.
0 mutants timed out.
8 mutants killed.
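If I did want to kill this particular mutant, a pair of boundary tests around the upper limit would do it. The sketch below is self-contained rather than written as Jasmine specs, so it copies the validation logic inline. The names original, mutant and boundaryTests are mine.

```javascript
const { strictEqual } = require('node:assert');

// Copy of the validation logic so the sketch runs on its own.
const original = (password) => password.length > 0 && password.length < 11;
// The surviving mutant reported above.
const mutant = (password) => password.length > 0 && password.length <= 11;

// Boundary cases around the upper limit.
function boundaryTests(validate) {
  strictEqual(validate('a'.repeat(10)), true);  // longest valid password
  strictEqual(validate('a'.repeat(11)), false); // shortest too-long password
}

boundaryTests(original); // passes silently

let killed = false;
try {
  boundaryTests(mutant);
} catch (error) {
  killed = true; // the 11-character case fails against the mutant
}
console.log(killed ? 'mutant killed' : 'mutant survived'); // mutant killed
```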

So that’s very useful. But in the same way as 100% test coverage isn’t always practical, or even desirable, you may not always be able to, or want to, address each surviving mutant. The key is that mutation testing makes these gaps visible where they may have been hidden before. It’s up to you to use this information in a way you see fit.

Your Turn

Testing is a very broad subject, and there can still be bugs in a system despite no mutants surviving. Even so, mutation testing does add an extra level of confidence on top of code coverage, which I think makes it worth trying out at least. If you can find a tool that supports your language and ecosystem, give it a go. I’d love to hear how you get on.