How To Improve the Quality of Tests Using Mutation Testing

Use PITest in a Kotlin project for mutation testing

Published in Better Programming · 7 min read · Nov 17, 2022

In today’s article, I want to introduce mutation testing and show how this can help improve the quality of tests in your Kotlin project.

As a developer, my normal workflow is to write tests that verify that the productive code works as expected. I write different kinds of tests according to the test pyramid (described, for example, in Martin Fowler’s articles), each testing a different aspect of the application under test.

There are a lot of unit tests verifying the internal behavior of single classes or, following the sociable approach, the behavior of related classes, either by mocking all dependent classes or, better, by using test doubles. Fewer component tests verify the interaction between different components. Last but not least, a few tests check the integration of the code with external components like a database or a REST API. Sometimes there is an additional layer of tests verifying that the use cases relevant to the application user work as expected. These are called acceptance tests.

This is a very rough and short overview of how the different tests work together to guarantee that the productive code works correctly for the current state.

But you all know the situation: the productive code is updated, e.g., to add a feature, and afterwards some existing behavior no longer works as expected. The application user notices the faulty behavior, but none of the automated tests detected the bug before delivery. This raises the question of how to check how well the tests detect behavior changes in productive code.

There is still the misunderstanding that high code coverage through automated tests equals high test quality. I will show later in an example how tests with 100% coverage can still fail to notice a code change that alters the behavior.

Code coverage is just a tool that helps identify parts of the code that are not yet covered by tests; it says nothing about how well the existing tests are written. This is especially true for tests using mocking frameworks (like Mockito or MockK).

So, what other option is available for making a statement about the test quality?

Mutation Testing

This is where mutation testing comes into play. Mutation testing is a type of software testing that changes (mutates) the productive code and checks whether any of the existing tests detect the change. The goal is to ensure that the automated tests are written so that changes in the application's behavior are detected by at least one of them. It’s about checking the robustness and quality of tests.

Let’s give an overview of how the mutation analysis is done step by step:

For every part of the source code that is analysed, a number of mutants are created. A mutant is a version of the original code with exactly one change (e.g., inverting an if condition or increasing the counter of a loop by one).
All the tests are run against the original code and against each mutant.
The results are compared. If the tests fail when run against a mutant, they are good enough to detect that code change (the mutant is killed). If the test results are identical, the tests do not detect the change (the mutant survives) and need improvement.
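The steps above can be sketched in a few lines of Kotlin. This is only a simplified illustration, not how PITest works internally: the mutants are written by hand, a "test" is modelled as a plain predicate over an implementation, and all names (`isAtLeastFive`, `survivingMutants`, and so on) are made up for this example.

```kotlin
// Original production code: true when a is at least 5.
fun isAtLeastFive(a: Int): Boolean = a >= 5

// Simplification: a test is a predicate that returns true when it passes.
typealias Implementation = (Int) -> Boolean
typealias Test = (Implementation) -> Boolean

// Hand-written mutants, each differing from the original by exactly one change.
val mutants: Map<String, Implementation> = mapOf(
    "negated condition" to { a: Int -> a < 5 },
    "changed boundary" to { a: Int -> a > 5 },
)

// A mutant survives when every test still passes against it.
fun survivingMutants(tests: List<Test>): List<String> =
    mutants.filter { (_, mutant) -> tests.all { test -> test(mutant) } }.keys.toList()

// Only checks a value far away from the boundary.
val weakTests: List<Test> = listOf(
    { impl: Implementation -> impl(10) },
)

// Additionally pins down the behavior directly at the boundary.
val strongTests: List<Test> = weakTests + listOf<Test>(
    { impl: Implementation -> impl(5) },
    { impl: Implementation -> !impl(4) },
)
```

Both test sets pass against `::isAtLeastFive`. With `weakTests`, the "changed boundary" mutant survives, because a single check with the value 10 cannot tell `>=` from `>`. With `strongTests`, no mutant survives.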

For the creation of mutants, there are different types possible:

Decision mutation
Value mutation
Statement mutation

Decision Mutation

Control statements like if, when or for are changed.

// before:
if (a >= 5) {
    // ...
} else {
    // ...
}

// after:
if (a < 5) {
    // ...
} else {
    // ...
}

Value Mutation

Values of variables are changed.

// before:
val i = 1002
doSomething(i)

// after:
val i = -120
doSomething(i)

Statement Mutation

A statement is removed or replaced by another statement.

// before:
val list = listOf(1, 2, 3)
doSomething(list)
doSomethingElse(list)

// after:
val list = listOf(1, 2, 3)
doSomethingElse(list)

Now that the basics of mutation testing are explained, let’s check how mutation testing can be added to an existing Kotlin application.

For Kotlin, you can use PITest (PIT), a mutation testing system that was initially developed for Java but also works for Kotlin running on the JVM.

So, let’s start by adding PITest to a Kotlin application using Gradle.
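A minimal build.gradle.kts fragment could look like the sketch below. The `pitest` block reuses the settings that appear in the configuration later in this article (without the `mutators` entry); the plugin versions are assumptions and should be replaced with the ones that fit your project.

```kotlin
// build.gradle.kts (fragment; repositories and test dependencies are omitted)
plugins {
    kotlin("jvm") version "1.7.21"                 // assumed Kotlin version
    id("info.solidsoft.pitest") version "1.9.0"    // gradle-pitest-plugin, assumed version
}

pitest {
    setProperty("junit5PluginVersion", "0.15")                   // PIT plugin for JUnit 5 tests
    setProperty("testPlugin", "junit5")
    setProperty("targetClasses", listOf("com.poisonedyouth.*"))  // classes to mutate
    setProperty("outputFormats", listOf("HTML"))                 // report format
    setProperty("failWhenNoMutations", false)                    // do not fail when nothing can be mutated
}
```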

More details about the configuration options can be found in the documentation of the Gradle plugin (gradle-pitest-plugin).

With this configuration, the mutation tests can be executed with the following command:

gradle pitest

The result is not very spectacular because neither tests nor source code exist yet.

Now that the setup of PITest is finished, let’s create a sample productive class and tests for that class.

The sample productive code looks like this:
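The listing below is a reconstruction, derived from the modified version shown later in this article, which returns an empty list "instead of the input list" for a blank filter. It is written as a top-level function here; whether the original lives inside a class is an assumption left open.

```kotlin
// Reconstructed sketch of the function under test (see the note above).
fun filterList(list: List<String>, filter: String): List<String> {
    // A blank filter means "no filtering": return the input unchanged.
    if (filter.isBlank()) {
        return list
    }
    // Otherwise keep only the elements that start with the filter.
    val result = mutableListOf<String>()
    for (element in list) {
        if (element.startsWith(filter)) {
            result.add(element)
        }
    }
    return result.toList()
}
```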

The initial tests are shown below:

When running the existing tests with coverage in IntelliJ, I can see that the coverage is 100%. The tests therefore seem good enough to detect changes in productive code.

To check whether this is true, let’s change the existing code and run the tests again to see if any test detects the change in behavior. For simplicity, I return an empty list instead of the input list when the filter is blank. Here’s what that looks like:

fun filterList(list: List<String>, filter: String): List<String> {
    if (filter.isBlank()) {
        // changed: returns an empty list instead of the input list
        return emptyList()
    }
    val result = mutableListOf<String>()
    for (element in list) {
        if (element.startsWith(filter)) {
            result.add(element)
        }
    }
    return result.toList()
}

The tests still pass, so as a developer I get no hint that the functionality has changed (perhaps by mistake).

Let’s see what the result of running the mutation test is for the current tests:

Four mutations were created, but the existing tests did not kill one. I can also see in the test report which mutators were used. Because I have not configured anything, the default mutators are used. The set of mutators can be configured (see PITest).

To verify the behavior of the mutation tests, in the next step I will improve my tests so that they check the behavior of the productive method more strictly.
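The difference between loose and strict assertions can be illustrated with a small, self-contained sketch. It is not the article's actual test code: the tests are modelled as plain functions instead of JUnit tests, the two implementations are condensed versions of the original and the changed method, and all names are made up for this example.

```kotlin
typealias FilterFn = (List<String>, String) -> List<String>

// Original behavior: a blank filter returns the input list unchanged.
fun filterOriginal(list: List<String>, filter: String): List<String> =
    if (filter.isBlank()) list else list.filter { it.startsWith(filter) }

// Changed behavior: a blank filter now returns an empty list.
fun filterChanged(list: List<String>, filter: String): List<String> =
    if (filter.isBlank()) emptyList() else list.filter { it.startsWith(filter) }

// Loose checks: every branch is executed, but the assertions hardly
// constrain the result, so the change goes unnoticed.
fun weakTestsPass(filterList: FilterFn): Boolean {
    val input = listOf("apple", "banana")
    val blankResult = filterList(input, "")    // covers the blank-filter branch
    val prefixResult = filterList(input, "a")  // covers matching and non-matching elements
    return blankResult.size <= input.size && prefixResult.all { it in input }
}

// Strict checks: the exact expected result is asserted for each case.
fun strictTestsPass(filterList: FilterFn): Boolean {
    val input = listOf("apple", "banana", "avocado")
    return filterList(input, "") == input &&
        filterList(input, "a") == listOf("apple", "avocado") &&
        filterList(input, "x").isEmpty()
}
```

`weakTestsPass` returns true for both `::filterOriginal` and `::filterChanged`, while `strictTestsPass` returns true only for `::filterOriginal`. Asserting exact results is what allows a test to kill this kind of mutant.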

When I run the mutation tests again, the report looks as expected.

To be fair, I only optimized the tests for the default mutators. If I update the mutation test configuration to use all available mutators, the result looks different.

pitest {
    setProperty("junit5PluginVersion", "0.15")
    setProperty("testPlugin", "junit5")
    setProperty("targetClasses", listOf("com.poisonedyouth.*"))
    setProperty("outputFormats", listOf("HTML"))
    setProperty("failWhenNoMutations", false)
    setProperty("mutators", listOf("ALL"))
}

A lot more mutations are created, and I have to look at my tests again.

Summary

As you can see in this simple example, mutation testing helps me check how well my tests are written, so that changes in productive code (which can also happen by mistake) are detected reliably.

But you should remember that mutation testing is a tool that gives hints about the quality of your tests. Like code coverage, it is no universal solution whose results are the only truth.

It is unnecessary to run the mutation tests with every commit on the CI/CD pipeline because the runtime grows quickly as more productive code is written. Instead, it can help to execute the mutation tests from time to time for parts of the project (e.g., refactored or newly implemented code) to get a hint about the quality of the tests.
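One way to limit a run to a part of the project is to narrow the `targetClasses` setting already used in the configuration above. The sub-package name below is hypothetical and only serves as an illustration.

```kotlin
// build.gradle.kts (fragment): mutate only one part of the code base
pitest {
    // "com.poisonedyouth.service.*" is a hypothetical sub-package, e.g. recently refactored code
    setProperty("targetClasses", listOf("com.poisonedyouth.service.*"))
}
```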

As an example, the ktor-encryption-server project has the following size (current version)