My Take on Property-Based Testing
A few months ago, Fred gave me a copy of his latest book (Property-Based Testing with PropEr, Erlang, and Elixir) so I could review it. So, here I am, returning the favor. But I’ll also use this chance to express some of my feelings and opinions about Property-Based Testing in general, since reading the book elicited quite a few of them. This will not be one of the usual articles on this blog, but I hope you’ll enjoy it anyway.
In a nutshell, this book is a very extensive and detailed manual/hands-on-tutorial with which you’ll first learn the general concepts behind Property-Based Testing (e.g. properties, generators, shrinking, etc.), then you’ll learn the basics of the methodology and tools and finally some of the more advanced techniques like custom generators, stateful properties and more.
Fred does a great job of walking you through the material step by step: as you read the book, each chapter builds on the previous ones, yet each one is just a tiny step that you can tackle easily. Suddenly, you reach the end of the book and you realize you learned a lot.
You should be aware of one thing, tho: It’s based on PropEr. You’ll find both Erlang and Elixir code samples, but all of them will use this framework. Same thing for the exercises. So, if you’re planning to practice what you read (which is something the book encourages you to do), you’ll be using it. Luckily, it’s an open-source framework so you can do that for free! 💪 That’s not to say that you can’t extrapolate what you learn to other frameworks, but it won’t be as easy.
My Personal Experience
I learned about Property-Based Testing at my University, almost 15 years ago. When I learned about it, the library that we used was QuickCheck (the Haskell version). At that time, I loved it. It seemed almost magical to me but I didn’t actually use it besides some examples and class assignments.
5 years later, I developed my thesis project, also in Haskell. I had just learned TDD from Hernán Wilkinson and I thought: what could be better than using Property-Based Testing for TDD? An amazing tool, combined with an incredibly effective technique…
My thesis project is still on GitHub; you can check its code and try to run its tests. I believe the last time I tried to run them it was still 2010… and they would probably be running still… 🤦‍♂️
Fast-forward another 5 years and I was working with the Inakos. We got ourselves a pretty fancy QuickCheck license and started trying to add stateful property tests to one of our biggest projects (which included an HTTP API built in Erlang with Cowboy). The result was a good set of properties that were a bit hard to read and understand and that actually found… 0 bugs. 🤷‍♂️
You can tell I learned a few lessons about PBT over the years…
First of all, I want to set one thing straight: I believe that the reasoning behind Property-Based Testing is sound. As Fred puts it…
[With Property-Based Testing] you’ll be able to write simple, short, and concise tests that automatically comb through your code the way only the most obsessive tester could.
Writing good properties and letting a good framework check them against your code will undoubtedly test your code much much better than any unit test you (or the most obsessive tester) can ever write. I don’t think anybody can deny that.
What stuff should be tested this way?
Is it worth it to test all your code with properties? I don’t think so. That’s the basis of my thesis project problem: I tried to use QuickCheck to test everything (including the GUI). That was not a wise choice, for various reasons:
- Property-Based tests are slower than unit tests. This is by design, since each property is tested a multitude of times instead of using just one example.
- Writing good properties is not easy. Fred states that multiple times in his book. Sometimes, a good unit test is not hard to write even if it covers just one possible scenario, but writing a property that captures that same thing in such a general way that you can have 10000 different instances with which to test the system is much more complex.
- Writing enough properties is not easy. In the same way that you can hardly ever be sure that you wrote enough unit tests to really cover all the functionality in the module/component/system you’re testing, it’s hard to be sure that you wrote enough properties for it. Fred does a great job of showing this situation in his book when he describes the different types of properties you can write, particularly when you’re not verifying your system behavior against a model. But you should also be wary of adding too many properties since each property takes time to run and you don’t want your suites to run forever…
So, what is worth testing with properties? I believe Fred summarized this idea quite well in this podcast when he said:
It comes down to figuring out what you want to write a test for first […] With Property-Based Tests this is really really hard because you need to find a general rule, but if you have something so stupid simple that the rule is the test itself you can not just do that.
You want to find something that is not trivial, but that you understand well and has significant complexity in its implementation. Usually data structures are interesting for that […].
From my perspective, there are things that are totally worth checking with properties and those are, in general, the ones used by Fred as examples in his book…
Data Structures
When you’re writing a module to manage a new complex data structure, like a new model for lists, a particular kind of tree, a hash table, etc., writing unit tests will almost certainly miss a bunch of corner cases that property-based testing will not. Writing tests as properties will also help you define your module better and come up with a nicer API for it. Besides, as the book shows with list tests from OTP, property-based tests will use far less code, too.
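To make the data-structure case concrete, here is a minimal sketch in Python rather than Erlang, using a hand-rolled random loop instead of PropEr (so there is no shrinking). The tuple-based tree, the function names, and the value ranges are all assumptions made up for this illustration. The property says that, for any list of integers, inserting them all into a binary search tree and reading it back in order yields the sorted list of distinct values.

```python
import random

def bst_insert(tree, value):
    """Insert value into a BST stored as (left, value, right) tuples, or None when empty."""
    if tree is None:
        return (None, value, None)
    left, root, right = tree
    if value < root:
        return (bst_insert(left, value), root, right)
    if value > root:
        return (left, root, bst_insert(right, value))
    return tree  # duplicates are ignored

def bst_to_list(tree):
    """In-order traversal of the tree."""
    if tree is None:
        return []
    left, root, right = tree
    return bst_to_list(left) + [root] + bst_to_list(right)

def prop_inorder_is_sorted_set(values):
    """Property: the in-order traversal equals the sorted distinct input values."""
    tree = None
    for value in values:
        tree = bst_insert(tree, value)
    return bst_to_list(tree) == sorted(set(values))

def find_counterexample(prop, runs=500, seed=0):
    """Try the property on random integer lists; return a failing input or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        values = [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
        if not prop(values):
            return values
    return None
```

A single property like this exercises empty inputs, duplicates, and already-sorted runs without anyone having to list those cases by hand, which is the corner-case coverage argument in a nutshell.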
Optimizations / Refactoring
One of the best scenarios you can find to use properties for testing is when you’re refactoring something that you can be sure works correctly. Let’s say you’re trying to optimize a certain function for performance, or a module for readability. Then you can use the previous version as a model and verify that your new implementation works exactly as the old one did. Writing properties will be at least as easy as writing unit tests (I think it will be easier since you don’t need to come up with the examples yourself) and it will give you much more confidence.
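The "old version as a model" idea can be sketched as follows. This is a Python illustration with a hand-rolled random loop, not PropEr, and both de-duplication functions are hypothetical stand-ins for "the code you trust" and "the code you just rewrote".

```python
import random

def slow_unique(items):
    """Old, trusted implementation: order-preserving de-duplication, O(n^2)."""
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result

def fast_unique(items):
    """New implementation under test: same contract, using a set for lookups."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

def check_against_model(new_impl, model, runs=500, seed=0):
    """Compare both implementations on random lists; return a failing input or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        data = [rng.randint(-5, 5) for _ in range(rng.randint(0, 20))]
        if new_impl(data) != model(data):
            return data
    return None
```

Calling `check_against_model(fast_unique, slow_unique)` returns `None` when no disagreement is found. A rewrite that changes behavior, such as one that sorts the output, would come back with a concrete input on which the two versions differ.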
Libraries for Generic Processing
Much like the data structures above, you may be writing a library like the CSV parser Fred presents in his book: something where you don’t really know all of your use cases beforehand. Maybe you have a standard or RFC to guide you (that’s great, since encoding RFCs as properties is easier than coming up with general properties yourself) or some other properly written specification. Another thing that makes the use of properties easier is having complementary functionality (like encoding/decoding) so that you can write symmetric properties.
Complex Algorithms
This is, at least in my experience, the most common place to reap the benefits from writing properties: In most of your systems, at some point, you’ll have to write an algorithmic piece that sits at the core of your system’s logic. You likely won’t get this for free from a library and it might involve multiple data structures and/or some complex pieces. In his book, Fred presents the checkout code kata as an example of this. When your algorithm has to respond correctly to some parameters that may vary widely with quite a few edge cases, writing properties instead of unit tests definitely pays off. I would personally still write a unit test for the shrunk value produced each time the properties found a bug in my system, but that’s just me.
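A symmetric (round-trip) property can be as small as the sketch below. It is Python with a hand-rolled random loop, not PropEr, and it uses a toy run-length encoder as a stand-in for the book's CSV encoder/decoder pair; every name here is made up for the example.

```python
import random

def rle_encode(text):
    """Encode a string as a list of (character, run_length) pairs."""
    pairs = []
    for ch in text:
        if pairs and pairs[-1][0] == ch:
            pairs[-1] = (ch, pairs[-1][1] + 1)
        else:
            pairs.append((ch, 1))
    return pairs

def rle_decode(pairs):
    """Rebuild the original string from (character, run_length) pairs."""
    return "".join(ch * count for ch, count in pairs)

def find_roundtrip_counterexample(encode, decode, runs=500, seed=0):
    """Property: decode(encode(text)) == text. Return a failing text or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        # A tiny alphabet makes repeated characters (the interesting case) likely.
        text = "".join(rng.choice("ab,") for _ in range(rng.randint(0, 12)))
        if decode(encode(text)) != text:
            return text
    return None
```

The property never states what the encoded form should look like. It only requires that decoding undoes encoding, which is why complementary functions make properties so much easier to find.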
Complex Stateful Systems
Then you have Stateful Properties, which are a great way to test your system as a whole when it has multiple API endpoints that can be executed in different sequences and conflict with each other. So, if you’re writing a system where, as stated in the book…
“what the code should do” — what the user perceives — is simple, but “how the code does it” — how it is implemented — is complex.
…then you’ll get a lot of benefits from writing Stateful Properties. But if your system is big but not complex (like your usual CRUD HTTP server) or if “what the code should do” is really hard to model, maybe not so much.
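The core idea behind stateful properties (generate random sequences of calls, run them against both the real system and a much simpler model, and compare) can be sketched like this. It is a Python illustration with hypothetical names, and PropEr's own stateful machinery is considerably richer than this loop. Here "what the code should do" is modeled by a plain dict, while the system under test is a deliberately clumsier list-backed store.

```python
import random

class ListStore:
    """System under test: a key-value store backed by a list of pairs."""
    def __init__(self):
        self._pairs = []

    def put(self, key, value):
        self._pairs = [(k, v) for k, v in self._pairs if k != key]
        self._pairs.append((key, value))

    def get(self, key):
        for k, v in self._pairs:
            if k == key:
                return v
        return None

    def delete(self, key):
        self._pairs = [(k, v) for k, v in self._pairs if k != key]

def random_command(rng):
    """Generate one command as a tuple: (name, key) or ("put", key, value)."""
    name = rng.choice(["put", "get", "delete"])
    key = rng.choice("abc")
    if name == "put":
        return (name, key, rng.randint(0, 9))
    return (name, key)

def agrees_with_model(store_factory, commands):
    """Run commands on the store and on a dict model; True if reads always agree."""
    store, model = store_factory(), {}
    for command in commands:
        name, key = command[0], command[1]
        if name == "put":
            store.put(key, command[2])
            model[key] = command[2]
        elif name == "delete":
            store.delete(key)
            model.pop(key, None)
        # After every command, reading the touched key must match the model.
        if store.get(key) != model.get(key):
            return False
    return True

def find_failing_sequence(store_factory, runs=300, seed=0):
    """Try random command sequences; return one that breaks the model or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        commands = [random_command(rng) for _ in range(rng.randint(1, 15))]
        if not agrees_with_model(store_factory, commands):
            return commands
    return None
```

The sketch also hints at why this pays off only when the model is simple: if the dict were as complicated as the store, you would just be writing the system twice.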
What about your Systems?
Now the question is: How often do you write systems that are worth testing with properties?
How often do you create new data structures? While I work with opaque data structures almost every day at work, they’re generally flat (i.e. a map or a record with multiple fields and their accessors). I don’t regularly have to create a new type of tree or hash table.
How often do you work on large enough refactoring/optimization tasks? Large/complex enough to merit adding properties to compare the new version with the old one thoroughly? Some people (I remember Hernán, for instance) may do that all the time… Me? Not so much.
How often do you write new generic libraries? I actually used to do that a lot at Inaka and I believe many of those libraries would certainly benefit from property-based testing. On the other hand, in scenarios like the one described by the book, where you find yourself writing a system that needs to parse a CSV file… well, I don’t think I will ever face that requirement and go “OK. I have to write a generic CSV parser now.” I will either try to find a library for that or write just the code to do exactly what my system needs (i.e. parse a 3-column/0-peculiarities CSV file in the case of the book).
How often do you write complex algorithms or whole stateful systems? This actually happens more often than any of the stuff in the previous paragraphs; it happens to me at least once per system, maybe more. Of course, this only includes relatively small pieces of the systems I build, but I do believe that using property-based testing for them would be a nice addition.
In conclusion, as I see it: Property-Based Testing is a great tool that you should seriously consider using for those places where its benefits over traditional example-based tests outweigh its drawbacks (basically time consumed writing and running the property tests). But, as with any other tool, it shouldn’t be the only tool at your disposal and therefore I would not advocate for Properties-Driven Development.
If/when you decide to start using Property-Based Testing, you should totally read Fred’s book. It will guide you through the process and it will make it smooth and enjoyable. It won’t make you an expert in writing properties, but it will get you as far as a book can take you.
Finally, besides the main topic of the book and this article, I want to mention some other things related to the book:
- I still hate Erlang macros. If I ever start using PropEr for Erlang, I’ll certainly be the weirdo that writes
?FORALL. PropEr macros for Elixir are much nicer, though.
- Targeted Properties and Simulated Annealing are cool stuff to learn about, even if you won’t use Property-Based Testing that much. Don’t skim over that chapter in the book.
- When working with opaque data structures, the book simplifies a bunch of concepts (for instance, it generates objects of a type by building maps, which is wrong since PropEr should not know how that type is represented inside the module). Don’t learn all your lessons about opaque data structures from the book. It’s not a book about that :)
- Hidden in plain sight within the book you can find one of the best pieces of advice on how to structure your programs that I’ve ever read. There are several paragraphs and pictures about it but in a nutshell…
…side effects can be grouped together at one end of the system, and we can keep the rest of the code as pure as possible.
A reminder: Erlang Battleground is still looking for writers. If you want to join us, just get in touch with me (Brujo Benavides) and I’ll add you to our publication.