Short circuiting method executions to assess test quality
--
The growing adoption of DevOps has boosted automated testing. Development teams spend significant effort and time implementing good-quality test suites that check code correctness at every build. Yet this effort is fully rewarded only if the test suites are actually good at capturing potential bugs. Test inputs might miss certain execution paths, and test assertions might fail to observe certain buggy states.
One way to assess that test cases actually observe incorrect states is to introduce small changes on purpose and check that the test cases fail. If at least one test case covers the part where the change is introduced, but no test case fails, the test suite can be improved. This is known as mutation testing. An extreme form of mutation testing consists in completely removing the body of a method that is covered by at least one test case [1].
Short-circuiting method execution
Intuitively, extreme mutation testing consists in short-circuiting the execution of covered methods, one at a time, to determine whether the test suite is able to observe this kind of extreme change. A short-circuited method that does not make any test case fail is called a pseudo-tested method.
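The procedure described above can be sketched as a small loop. This is only an illustration under stated assumptions, not how any particular tool works: the class ExtremeMutationSketch, the shortCircuited switchboard, and the toy methods addToTotal and countCall are all hypothetical names introduced here. A real tool would transform the code itself rather than consult a switch at run time.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BooleanSupplier;

class ExtremeMutationSketch {
    // Hypothetical switchboard: names of the methods whose bodies are currently skipped.
    static final Set<String> shortCircuited = new HashSet<>();

    // Toy code under test (all names are illustrative).
    static int total = 0;
    static int calls = 0;

    static void addToTotal(int x) {
        if (shortCircuited.contains("addToTotal")) return; // body removed
        total += x;
    }

    static void countCall() {
        if (shortCircuited.contains("countCall")) return; // body removed
        calls++;
    }

    // A test returns true when it passes. It covers both methods but only checks total.
    static boolean testAddToTotal() {
        total = 0;
        calls = 0;
        addToTotal(5);
        countCall();
        return total == 5;
    }

    // Short-circuit each covered method in turn; keep those for which no test fails.
    static List<String> findPseudoTested(List<String> covered, List<BooleanSupplier> tests) {
        List<String> pseudoTested = new ArrayList<>();
        for (String method : covered) {
            shortCircuited.add(method);
            boolean someTestFails = false;
            for (BooleanSupplier test : tests) {
                if (!test.getAsBoolean()) {
                    someTestFails = true;
                    break;
                }
            }
            shortCircuited.remove(method); // restore the original behavior
            if (!someTestFails) {
                pseudoTested.add(method);
            }
        }
        return pseudoTested;
    }

    public static void main(String[] args) {
        BooleanSupplier test = ExtremeMutationSketch::testAddToTotal;
        List<String> covered = List.of("addToTotal", "countCall");
        System.out.println(findPseudoTested(covered, List.of(test))); // prints [countCall]
    }
}
```

Here countCall is reported because the only test that covers it never looks at the calls field, while short-circuiting addToTotal makes the test fail.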
Consider a test case testAdd whose execution triggers the execution of method incrementVersion in class VList. Yet, if we short-circuit the execution of incrementVersion by removing all the instructions of the method, testAdd does not fail. In this case we say that method incrementVersion is pseudo-tested. This analysis reveals one weakness of testAdd: while it covers method incrementVersion, it does not actually assess its behavior (the modification of the version field).
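A minimal sketch of this situation is given below. It is a hypothetical reconstruction: only the names VList, incrementVersion, testAdd, and the version field come from the text, while the method bodies, the ShortCircuitedVList subclass (which stands in for the short-circuited method), and the strengthened test testAddChecksVersion are assumptions made for illustration. Tests are written as plain methods that return true when they pass, so that the example runs without a test framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction of the class discussed above.
class VList {
    private final List<Object> elements = new ArrayList<>();
    private int version = 0;

    void add(Object item) {
        elements.add(item);
        incrementVersion();
    }

    // The method under scrutiny: its only effect is to update the version field.
    protected void incrementVersion() {
        version++;
    }

    int size() {
        return elements.size();
    }

    int getVersion() {
        return version;
    }
}

// Stand-in for the short-circuited method: incrementVersion has an empty body.
class ShortCircuitedVList extends VList {
    @Override
    protected void incrementVersion() {
        // all instructions removed
    }
}

class VListDemo {
    // Mirrors testAdd: covers incrementVersion but asserts only on the size.
    static boolean testAdd(VList list) {
        list.add(new Object());
        return list.size() == 1;
    }

    // A strengthened test (assumed) that also observes the version field.
    static boolean testAddChecksVersion(VList list) {
        int before = list.getVersion();
        list.add(new Object());
        return list.size() == 1 && list.getVersion() == before + 1;
    }

    public static void main(String[] args) {
        System.out.println(testAdd(new VList()));                            // true
        System.out.println(testAdd(new ShortCircuitedVList()));              // true: change not observed
        System.out.println(testAddChecksVersion(new VList()));               // true
        System.out.println(testAddChecksVersion(new ShortCircuitedVList())); // false: change observed
    }
}
```

With the original test, incrementVersion is pseudo-tested; once a test asserts on the version field, short-circuiting the method makes that test fail.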
Role of short-circuiting for test improvement
We have built the Descartes tool, which automatically short-circuits covered methods and determines the list of pseudo-tested methods in Java projects. We ran this tool on 21 open source Java projects and analyzed a total of 28K+ methods in these projects.
The most important results of this experiment are as follows:
- short-circuiting the complete execution of methods provides valuable feedback to developers. The developers have a clear goal when writing a test: to make the method no longer pseudo-tested. Developers are more comfortable reasoning at the granularity of a method than at the statement level (as in traditional, fine-grained mutation testing).
- short circuiting methods has revealed the presence of pseudo-tested methods in all the projects that we have analyzed, even the ones with very high code coverage (cf. picture below). Development teams of all Java projects can benefit from this type of analysis to assess their test suites and improve them.
- interviews with developers reveal that some pseudo-tested methods actually reveal major weaknesses in the test suite. We have collected empirical evidence of test suites fixed after running a short-circuiting experiment.
If you want to know more about the intriguing nature of pseudo-tested methods, we discuss several examples in our paper [2]. In particular, we analyze the reasons behind the presence of these methods and the challenges they pose for improving the test suites of real-world, large Java programs.
- [1] Will my tests tell me if I break this code? (Rainer Niedermayr, Elmar Juergens, Stefan Wagner), In Proceedings of the International Workshop on Continuous Software Evolution and Delivery, 2016.
- [2] A Comprehensive Study of Pseudo-tested Methods (Oscar Vera-Perez, Benjamin Danglot, Martin Monperrus, Benoit Baudry), In Empirical Software Engineering, 2018.
- Descartes tool for short-circuiting method executions: https://github.com/STAMP-project/pitest-descartes