Mutation Testing For Move: Improve Your Unit Tests
By Karlo Mardešić (Eiger), Vineeth Kashyap (Aptos Labs)
Abstract
Exhaustive testing is crucial for contracts that handle valuable assets. But how do you know when your tests are good enough? The Move Mutation Tester is a novel tool in the Move toolset which automatically finds blind spots in a test suite.
Introduction
Good unit tests are important for ensuring the correctness of software and for documenting its intended behavior. They are an expected practice in modern software engineering, and they let software evolve with a high degree of confidence that a code change (adding a new behavior or fixing a bug) does not break the intended behavior. In the realm of smart contracts, good unit tests are even more important, given the dire financial consequences of buggy code.
Measuring code coverage is the primary method to evaluate the quality of unit tests. The Aptos CLI has a coverage tool, which we recommend as a best practice. However, code coverage only measures whether a line of code is executed by the unit tests; it does not measure whether the tests perform the right semantic checks on the code being tested. In a previous blog post, we showed why code coverage is insufficient for measuring the quality of unit tests.
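To make the gap between coverage and semantic checks concrete, here is a minimal sketch in Python rather than Move. The function `clamp`, its faulty variant `buggy_clamp`, and both tests are hypothetical and exist only for illustration. The weak test executes every line of the function, so line coverage is complete, yet it cannot tell the correct version from the faulty one.

```python
def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


def buggy_clamp(value, low, high):
    """Hypothetical faulty variant: values below the range map to the wrong bound."""
    if value < low:
        return high  # injected fault: should return low
    if value > high:
        return high
    return value


def weak_test(fn):
    """Executes every line of fn but checks nothing about the results."""
    fn(-5, 0, 10)
    fn(50, 0, 10)
    fn(5, 0, 10)
    return True


def strong_test(fn):
    """Runs the same calls, but compares each result with the expected value."""
    return (
        fn(-5, 0, 10) == 0
        and fn(50, 0, 10) == 10
        and fn(5, 0, 10) == 5
    )


if __name__ == "__main__":
    print("weak test on buggy code passes:", weak_test(buggy_clamp))
    print("strong test on buggy code passes:", strong_test(buggy_clamp))
```

Both tests reach the same line coverage; only the one that checks results notices the fault.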
Mutation testing is a technique to find blind spots in unit tests and goes much further than coverage tools. It injects faults in source code, and checks if unit tests can detect those faults. We now have the first release of the move-mutation-test tool (available here, developed by our collaborators at Eiger). It evaluates the quality of Move unit tests, and we would love for you to try it out on your Move projects and give us feedback.
How does the tool work?
The tool works in two steps:
- It mutates the original source code in various ways. Each mutated version is called a mutant.
- The tool then runs the (unmodified) unit tests for each mutant. We would expect mutants to cause unit test failures. Two outcomes are possible:
  - The mutant fails the tests. These mutants are called killed mutants. They show that the tests catch the injected faults.
  - The tests succeed for the mutant. These mutants are called alive mutants. Each one points to a potential blind spot in the tests.
If the mutation test tool report is filled with a lot of alive mutants, then it is worth looking at improving the unit tests.
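The two steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how move-mutation-test is implemented: the function `is_adult`, the textual mutation list, and the helper names `load`, `run_tests`, and `mutation_run` are all hypothetical, and mutants are produced by plain string replacement rather than by a real mutation engine.

```python
# Source of the hypothetical function under test.
ORIGINAL_SOURCE = """
def is_adult(age):
    return age >= 18
"""

# Each mutation replaces the first occurrence of one token with another.
MUTATIONS = [(">=", ">"), (">=", "<="), ("18", "19")]


def load(source):
    """Compile the source text and return the is_adult function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["is_adult"]


def run_tests(is_adult):
    """The unmodified test suite: returns True when all checks pass."""
    return is_adult(30) is True and is_adult(5) is False


def mutation_run(source, mutations, tests):
    """Step 1: create each mutant. Step 2: run the tests and classify it."""
    report = {"killed": [], "alive": []}
    for old, new in mutations:
        mutant_source = source.replace(old, new, 1)
        tests_passed = tests(load(mutant_source))
        outcome = "alive" if tests_passed else "killed"
        report[outcome].append(f"{old} -> {new}")
    return report


if __name__ == "__main__":
    print(mutation_run(ORIGINAL_SOURCE, MUTATIONS, run_tests))
```

In this sketch only the `>=` to `<=` mutant is killed. The other two survive because the tests never exercise the boundary value 18; adding a check such as `is_adult(18) is True` would kill both.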
Example walkthrough of the tool usage
You can install the tool using:
aptos update move-mutation-test
Make sure to have the latest aptos CLI (at least version 7.0.0). Alternatively, you can install from source.
Let’s perform a small exercise within a real-world project to show how the tool could be used. We will use the aptos-stdlib project.
The tool has two main subcommands:
- move-mutation-test run (runs the tool and generates a report)
- move-mutation-test display-report (nicely displays the results)
The tool can generate a large number of mutants for the entire project, and running tests on each mutant can be relatively slow. The recommended way to use it is by being more directed and specific, i.e., using it on a per-module or per-function basis.
Let’s select the fixed_point64 module to generate the mutants. To avoid generating mutants that would definitely survive due to a lack of unit test coverage, we use the --coverage flag, which restricts mutation to the parts of the code that the unit tests already cover.
Note: To use the --coverage flag, the user first needs to run the aptos move test --coverage command, which generates the unit test coverage report and stores it locally within the project files. Below, we assume that this command has already been run.
move-mutation-test run --coverage --output report.txt --mutate-modules fixed_point64
Once this command executes, we should see a short summary that tells us the number of alive mutants per function in the module.
From the above report, we can observe that the function round has nine surviving mutants. Let’s use the following command to inspect the results in more detail:
move-mutation-test display-report coverage --path-to-report report.txt
If we scroll further down, we find the round function, where we can see the mutation testing coverage: each line is annotated with the number of killed mutants out of the total number of mutants generated for it:
Note that the ideal scenario is to have all or most mutants killed. This is a simple overview of the mutation testing coverage of the function, but it doesn’t tell us which mutants survived. For that purpose, we can use the mutants subcommand:
move-mutation-test display-report mutants --modules fixed_point64 --functions round
As a next step, let’s see how we can improve the unit tests so that these mutants would fail the tests.
From the above, we can see that the tests for the round function could be improved further. Let’s try to enhance these tests with the below:
Now, we can rerun the tool. Let’s be more specific this time to make the execution shorter, and mutate only the round function with the command:
move-mutation-test run --coverage --output report.txt --mutate-modules fixed_point64 --mutate-functions round
We can already see from the summary report that the stats for this function have improved!
Let’s again check the coverage with the display-report coverage command:
move-mutation-test display-report coverage
Now, the coverage for mutation testing has improved (more mutants are killed), improving the quality of our tests.
The above output resembles the unit test coverage report. It shows per-line statistics for the number of killed mutants per total number of mutants.
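The per-line statistic described above is simple to compute once each mutant's outcome is known. The sketch below assumes a hypothetical input format, a list of (line number, killed) pairs, which is not the tool's actual report format. The function name `per_line_stats` is likewise only for illustration.

```python
from collections import defaultdict


def per_line_stats(results):
    """Summarize mutant outcomes as 'killed/total' per source line.

    results: iterable of (line_number, killed) pairs, where killed is True
    when the unit tests failed for that mutant.
    """
    counts = defaultdict(lambda: [0, 0])  # line -> [killed, total]
    for line, killed in results:
        counts[line][1] += 1
        if killed:
            counts[line][0] += 1
    return {line: f"{k}/{t}" for line, (k, t) in sorted(counts.items())}


if __name__ == "__main__":
    # Hypothetical outcomes: two mutants on line 10, one on line 12.
    sample = [(10, True), (10, False), (12, True)]
    for line, ratio in per_line_stats(sample).items():
        print(f"line {line}: {ratio} mutants killed")
```

A line whose ratio is well below the total marks code that the tests execute but do not check thoroughly.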
Conclusion
Mutation testing complements the code coverage tool by giving insight into the quality of the unit tests. It can find blind spots in unit tests, and in doing so it can also surface bugs in the source code. We recommend trying it out, especially for high-assurance Move contracts.
Note that, similar to how trying to achieve 100% code coverage everywhere may not be worthwhile (and sometimes even detrimental because it can cause the unit tests to be too brittle), achieving 100% mutation testing coverage may also lead you astray. The results of mutation testing should be used as guidance, and project-specific judgement should be used for identifying which parts of the codebase should receive more attention for improved testing.