Testing vs writing tests
Testing is different from writing tests. Developers write tests as a way to give themselves space to think and confidence for refactoring. Testing focuses on finding bugs. Both should be done.
We are now at a point where developers know that they need to write automated tests. After all, ideas such as Kent Beck’s Extreme Programming and Test-Driven Development, Michael Feathers's work on the synergy between testing and design, Steve Freeman and Nat Pryce's on how to grow object-oriented software guided by tests, and DHH and Ruby on Rails's insistence that a web framework should come with a testing framework have really stuck.
These ideas stuck with me as well. I have been trying to write as many automated tests as I can for the software systems I have worked on since 2007 (I can tell the story of how traumatic my 2006 project was in a later post). I was so excited about Test-Driven Development that I even made it the topic of my MSc thesis. In 2012, I thought I had enough experience with the topic, so I decided to write a book about how I was practicing TDD, as well as a book on the basics of tools like JUnit and Selenium (unfortunately, only available in Brazilian Portuguese).
After many years of pair programming, I see developers using automated test code as a sort of safety net. The tests enable them to think clearly about what they want to implement, and support them throughout the numerous refactorings they apply.
In fact, research has shown quite a few times that doing TDD can improve your class design (see Janzen, Janzen and Saiedian, George and Williams, Langr, Dogsa and Batic, Munir, Moayyed and Petersen, myself, among others). Recently, Fucci et al. even argued that the important part is to write tests (and it doesn't matter whether it's before or after). Practitioners share this perception. To quote a recent blog post from Feathers: "That’s the magic, and it’s why unit testing works also. When you write unit tests, TDD-style or after your development, you scrutinize, you think, and often you prevent problems without even encountering a test failure."
Writing automated tests is therefore something to be recommended.
Pragmatically speaking, I feel that, when test code is used as a way to strengthen confidence, good weather tests are often enough. And indeed, testing good weather does not require much knowledge of testing theory. What it requires is knowing how to make testing possible (developers talk a lot about designing for testability) and how to use the best tools available (and developers have indeed mastered tools like JUnit, Mockito, and Selenium).
What I argue is that, after implementing the production code (with all the support from the good weather test code), the next step should then be about testing it properly. Software testing is about finding many bugs; it is about exploring how your system behaves not only in good weather, but also in exceptional and corner cases.
I recently asked my first-year CS students what kinds of mistakes they had made in their programs so far that taught them that testing is important. I got a huge list:
- Programs that don't work when a loop doesn't iterate (the loop condition never evaluated to true, and the program crashed).
- Programs that don't work because they missed some last iteration (the famous off-by-one error).
- Programs that don't work when inputs are invalid, such as null, or do not conform to what's expected (e.g., a string that represents the name of a file to open containing an extra space at the beginning).
- Programs that don't work because the value is either outside the boundary or exactly on it. For example, if you have an if that expects a number to be greater than 10, what happens if the number is smaller than 10? Or precisely 10?
- Dependencies that are not there, e.g., your program reads from a file, but the file may not be there.
- Among others.
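The mistakes in this list translate directly into test cases. Below is a minimal sketch in Python, not taken from any real project: the functions `is_large`, `total`, and `read_first_line` are hypothetical stand-ins for production code, and the choice to return `None` for unusable paths is an assumption made only for illustration. The first test is the good weather test; the others explore the boundary, the loop that never iterates, and the invalid or missing inputs.

```python
import os
import tempfile


def is_large(number):
    """Hypothetical rule: True only for numbers strictly greater than 10."""
    return number > 10


def total(values):
    """Sum a list with an explicit loop; must also work if the loop never runs."""
    result = 0
    for value in values:
        result += value
    return result


def read_first_line(path):
    """Return the first line of a file, or None if the path is unusable."""
    if path is None:
        return None
    cleaned = path.strip()  # tolerate stray spaces around the file name
    if not os.path.isfile(cleaned):
        return None  # the dependency (the file) is not there
    with open(cleaned, encoding="utf-8") as handle:
        return handle.readline().rstrip("\n")


def test_good_weather():
    assert is_large(50)
    assert total([1, 2, 3]) == 6


def test_boundaries():
    assert not is_large(9)   # below the boundary
    assert not is_large(10)  # exactly on the boundary
    assert is_large(11)      # just above the boundary


def test_loop_that_never_iterates():
    assert total([]) == 0


def test_invalid_and_missing_inputs():
    assert read_first_line(None) is None
    with tempfile.TemporaryDirectory() as folder:
        # A file that was never created.
        assert read_first_line(os.path.join(folder, "missing.txt")) is None
        path = os.path.join(folder, "data.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("hello\nworld\n")
        # A file name with an extra space at the beginning.
        assert read_first_line(" " + path) == "hello"
```

A test runner such as pytest would collect the `test_` functions automatically; they can also be called directly as plain functions.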
Regardless of your experience, I am sure you also have faced some of these bugs before. But how often do developers actually explore and write such test cases?
My feeling is: not that often. (If you are an empirical software engineering researcher, this is a good question to answer.)
It is clear that, when it comes to software testing, researchers and practitioners talk about different topics. Very good pragmatic books, like Pragmatic Unit Testing in Java 8 with JUnit, have a strong focus on how to do test automation. While they provide good tips on what to test (the chapters on CORRECT and RIGHT-BICEP in the PragProg book are definitely interesting), they usually don't go beyond that.
On the other hand, software testing books from academia, my favorite being Software Testing and Analysis from Young and Pezzè, have a strong focus on theories and techniques for designing test cases that explore as much of the program under test as possible. However, they offer few timely examples of how to apply these ideas, and such examples are definitely a requirement for practitioners.
Robert Binder in his 2000 book (p.45) makes a distinction between “fault-directed” and “conformance-directed” testing:
- Fault directed: Intent to reveal faults through failures.
- Conformance directed: Intent to demonstrate conformance to required capabilities (e.g., meeting a user story, with a focus on good weather behavior).
I argue that a strong developer should test from both perspectives. They are complementary.
As a developer, if you want to take your testing to the next level, you should definitely get familiar with what researchers have been doing. They have been conducting beautiful work in the field of software testing, much of which I believe is applicable in practice. Some examples (but definitely not a complete list):
- Detecting whether a test suite is able to detect possible bugs. The idea of mutation testing has been applied at Google.
- Automatically generating tests (the famous example is EvoSuite, which is able to generate tests for Java classes) or automatically reproducing crashes. I myself worked on generating tests for SQL queries automatically.
- Deriving models from your production system logs that can be analysed by domain experts.
- Providing different code coverage criteria (you probably have some code coverage tool in your CI) and understanding the benefits of each of them.
- Building tools for testing web applications, such as tools that automatically explore a web app looking for crashes or that find crashes in REST APIs. More can be found in this web testing literature review.
- Empirically measuring how much developers test, whether test smells are important or whether they are the cause of flaky tests, why and how mocks are used in Java systems, and how developers review test code during code reviews.
- Property-based testing, which is also reaching practitioners by means of tools like ScalaCheck and QuickTheories.
- Among other things. And if you want to join this fight, read this summary from Antonia Bertolino on the current testing research challenges.
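To make the first item of the list concrete, here is a hand-rolled sketch of the idea behind mutation testing, written in Python. It is an illustration under assumptions, not how any real mutation testing tool works internally: `is_large` is a hypothetical production function, the mutant is written by hand (real tools generate mutants automatically), and each "suite" is just a function that returns whether all of its checks pass.

```python
def is_large(number):
    """Original (hypothetical) production code: strictly greater than 10."""
    return number > 10


def is_large_mutant(number):
    """Mutant: the relational operator > was replaced by >=."""
    return number >= 10


def good_weather_suite(implementation):
    """Return True if every good weather check passes for the implementation."""
    return implementation(50) is True and implementation(1) is False


def boundary_suite(implementation):
    """Good weather checks plus the values around the boundary."""
    return (
        good_weather_suite(implementation)
        and implementation(10) is False
        and implementation(11) is True
    )


def kills(suite, mutant):
    """A suite kills a mutant when at least one of its checks fails on it."""
    return not suite(mutant)


# Both suites pass on the original code.
print(good_weather_suite(is_large), boundary_suite(is_large))  # True True
# Only the boundary suite notices that the operator changed.
print(kills(good_weather_suite, is_large_mutant))  # False
print(kills(boundary_suite, is_large_mutant))      # True
```

The surviving mutant shows exactly which bug the good weather suite would let through: a wrong comparison on the boundary value 10.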
Writing tests is different from testing. They both provide developers with different outcomes. The former gives confidence and support throughout development. The latter makes sure the software will really work in all cases. While they are both important, it's about time for both communities to get aligned:
- Developers to get more familiar with more advanced software testing techniques (which researchers have been providing). They are definitely a good addition to their toolbelts and will help them to better test their software.
- Educators to teach both perspectives. My perception is that universities teach testing techniques in an abstract way (developers need to learn how to use the tools!), while practitioners only teach how to use the automated tools (developers need to learn how to properly test!).
- Researchers to better understand how practitioners have been evolving the software testing field from their perspective as well as to better share their results, as developers do not read papers. Research lingo is for researchers.
EDIT: Mariefred (from r/programming) suggested this related link: Testing and Checking refined.
Acknowledgements: I thank Alberto Souza (Caelum, São Paulo), Anderson Leite (RG/A, San Francisco), and Arie van Deursen (Delft University of Technology) for their feedback on earlier versions of this post.