An Introduction to Fuzz Testing
Moving Beyond Unit Tests
What is a fuzz test?
A fuzz test, also called an invariant test or property-based test, is a test in which the test runner generates random inputs and the developer asserts that some property, an invariant, holds true for every one of them.
This is best understood by example. Consider a function that accepts a list as input and returns a reversed list.
Given a list of type a, this function will return a reversed copy of that list.
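As a point of reference, here is a minimal sketch of what such a function might look like in Elm. This is an assumption for illustration, not necessarily the implementation used in this post; Elm's core library already provides List.reverse, and the module name Reverse is made up.

```elm
module Reverse exposing (reverse)

-- Hypothetical sketch of the function under test.
-- Folding from the left conses each element onto the front of the
-- accumulator, so the first element of the input ends up last.
reverse : List a -> List a
reverse list =
    List.foldl (::) [] list
```

Because the type variable a is unconstrained, the same function works for lists of integers, strings, or any other element type.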
The traditional way to test this function is for the unit test author to make up a mock list or two and make some assertions about them. But if we were to author a fuzz test we would approach it differently.
Given any random list, what should hold true when using reverse in a test? One invariant that should hold true is that if I were to take an arbitrary list and pass it to reverse, and take the output of that and pass it to reverse again, then I should have the original list.
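The double reverse invariant can be sketched as an elm-test fuzz test roughly like this. The module name, the test description, and the inlined reverse are assumptions made so the sketch stands on its own; Fuzz.list, Fuzz.int, Test.fuzz and Expect.equal are part of elm-test.

```elm
module ReverseTest exposing (suite)

import Expect
import Fuzz
import Test exposing (Test, fuzz)

-- Function under test, repeated here so the sketch is self-contained.
reverse : List a -> List a
reverse list =
    List.foldl (::) [] list

-- For every randomly generated list of integers,
-- reversing it twice should give back the original list.
suite : Test
suite =
    fuzz (Fuzz.list Fuzz.int) "reversing a list twice yields the original list" <|
        \randomList ->
            randomList
                |> reverse
                |> reverse
                |> Expect.equal randomList
```

Placed in a project's tests directory, a module like this is picked up by the elm-test command line runner.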
There are a couple of neat things to note here.
- Many fuzz testing libraries by default will produce 100 different random inputs to each fuzz test. In effect, the above code represents 100 test assertions because 100 different random lists are being tested. This makes finding edge cases far easier.
- We don’t have to write any mock data by hand. Notice that I didn’t create any made up lists from scratch. When it comes to fuzz testing, rather than making up mock data you are instead composing generators to produce random data in the desired shape.
Fuzz testing allows you to focus more on expected behavior and less on coming up with and maintaining test data.
elm-test shows us that this test passes.
A third aspect to fuzz testing is shrinking. Let’s deliberately break this test.
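One way to break it, sketched under the same assumptions as before (made-up module name, inlined function under test), is to swap the body of reverse for List.sort. The test itself stays exactly the same.

```elm
module BrokenReverseTest exposing (suite)

import Expect
import Fuzz
import Test exposing (Test, fuzz)

-- Deliberately wrong: sorting is not reversing.
-- Note the type is now restricted to comparable elements.
reverse : List comparable -> List comparable
reverse list =
    List.sort list

-- Same invariant as before; it now fails for any list
-- that is not already in ascending order.
suite : Test
suite =
    fuzz (Fuzz.list Fuzz.int) "reversing a list twice yields the original list" <|
        \randomList ->
            randomList
                |> reverse
                |> reverse
                |> Expect.equal randomList
```

Sorting twice always returns the sorted list, so every unsorted input is a counterexample to the invariant.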
Here I’ve updated the implementation of reverse to use List.sort, which violates the double reverse invariant. The test results reveal something incredibly useful.
elm-test not only found a list that made the test fail, but once a failing list was found it started shrinking it until it found the smallest possible list that still failed the test. In this case, the lists [0,-1] were the smallest possible values to fail this test.
As you might guess, shrinking is dependent upon the type of data being tested. Booleans shrink to false. Integers shrink to 0. Strings shrink to shorter strings.
Fuzz Testing a QuadTree
Let’s step it up a notch and move on to testing a QuadTree. A QuadTree is a spatially aware data structure. It’s a tree where each node is either a leaf node or has exactly four children. QuadTrees are commonly used in image processing, collision detection, and other applications.
To further elaborate, if we had points randomly positioned in 2D space those points might look like the following:
If we wanted to store those points in a QuadTree that QuadTree would look like this:
The two dimensional space is broken into directional quadrants and this happens recursively until all points are assigned to a leaf node. The underlying code for modeling the QuadTree is the following:
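To make the shape concrete, here is a minimal sketch of how such a tree could be modeled. Everything in it is an assumption: the Point alias, the Leaf and Node constructors, integer coordinates, and the field names are illustrative and may not match the type used in this post.

```elm
module QuadTreeModel exposing (Point, QuadTree(..), example)

-- Hypothetical 2D point with integer coordinates.
type alias Point =
    { x : Int, y : Int }

-- Hypothetical model: a node is either a leaf holding points,
-- or it has exactly four child quadrants.
type QuadTree
    = Leaf (List Point)
    | Node
        { northwest : QuadTree
        , northeast : QuadTree
        , southwest : QuadTree
        , southeast : QuadTree
        }

-- A small hand-built tree, with (0,0) taken to be the top left.
example : QuadTree
example =
    Node
        { northwest = Leaf [ { x = 1, y = 2 } ]
        , northeast = Leaf [ { x = 9, y = 1 } ]
        , southwest = Leaf []
        , southeast = Leaf [ { x = 8, y = 7 } ]
        }
```

Building even this tiny tree by hand takes some care, which hints at why hand-written test data for this structure gets tedious quickly.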
If you hand rolled your own QuadTree implementation you’d certainly want to test it. Realize that writing traditional unit tests for this data structure is especially tedious because you need to, by hand, create many mock QuadTrees and hope you’ve written enough to catch all edge cases.
Fuzz testing a QuadTree is entirely different. In order to fuzz test the QuadTree, we start off by asking: what invariants does a QuadTree have? This is a useful question to answer generally, but it’s especially important with fuzz testing.
So what are the invariants of a QuadTree?
The obvious one is that points should be contained within their correct quadrant. That is, points in the west quadrants should have smaller x values than points in the east quadrants, and points in the north quadrants should have smaller y values than points in the south quadrants, assuming (0,0) is at the top left of the screen.
If we phrase that in fuzz testing terminology, given a random QuadTree containing random points:
- All x coordinates in the northwest quadrant should be less than the x coordinates in both east quadrants
- All y coordinates in the northwest quadrant should be less than the y coordinates in both south quadrants
- All x coordinates in the northeast quadrant should be greater than the x coordinates in both west quadrants
- All y coordinates in the northeast quadrant should be less than the y coordinates in both south quadrants
- All x coordinates in the southwest quadrant should be less than the x coordinates in both east quadrants
- All y coordinates in the southwest quadrant should be greater than the y coordinates in both north quadrants
- All x coordinates in the southeast quadrant should be greater than the x coordinates in both west quadrants
- All y coordinates in the southeast quadrant should be greater than the y coordinates in both north quadrants
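The eight statements above can be sketched as a list of elm-test expectations over the points found in each quadrant. The Point and Quadrants aliases and the allLessThan helper are hypothetical names introduced for this illustration; Expect.lessThan and Expect.pass are part of elm-test.

```elm
module QuadrantInvariants exposing (Point, Quadrants, quadrantExpectations)

import Expect exposing (Expectation)

type alias Point =
    { x : Int, y : Int }

-- The points found in each quadrant of a single node (hypothetical shape).
type alias Quadrants =
    { northwest : List Point
    , northeast : List Point
    , southwest : List Point
    , southeast : List Point
    }

-- Every coordinate in `smaller` must be strictly less than every
-- coordinate in `larger`. Passes trivially if either list is empty.
allLessThan : (Point -> Int) -> List Point -> List Point -> Expectation
allLessThan coord smaller larger =
    case ( List.maximum (List.map coord smaller), List.minimum (List.map coord larger) ) of
        ( Just biggestSmall, Just smallestLarge ) ->
            biggestSmall |> Expect.lessThan smallestLarge

        _ ->
            Expect.pass

-- One expectation per invariant, in the order listed above.
quadrantExpectations : Quadrants -> List Expectation
quadrantExpectations q =
    let
        west = q.northwest ++ q.southwest
        east = q.northeast ++ q.southeast
        north = q.northwest ++ q.northeast
        south = q.southwest ++ q.southeast
    in
    [ allLessThan .x q.northwest east
    , allLessThan .y q.northwest south
    , allLessThan .x west q.northeast
    , allLessThan .y q.northeast south
    , allLessThan .x q.southwest east
    , allLessThan .y north q.southwest
    , allLessThan .x west q.southeast
    , allLessThan .y north q.southeast
    ]
```

Comparing the largest value on one side with the smallest on the other keeps each check to a single comparison rather than one per pair of points.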
Seeing it listed out like that may make it seem a little tedious, which is exactly what we’re trying to avoid. But this won’t be too bad. We’re going to write a single fuzz test that makes 8 assertions. This test will be provided with 100 different random QuadTrees automatically and result in 800 individual assertions!
Before we begin go ahead and take a glance at the full QuadTree code. Warning! This code has a bug. See if you can find it!
Take note that on line 1, even though the QuadTree type is exposed, the only way to create a QuadTree is by invoking the fromList function. What this means is that rather than creating QuadTrees from scratch, we’ll instead be creating random 2D point list generators whose output will later be passed to that function.
We could actually create a quadTreeFuzzer, but then we don’t get the benefits of shrinking.
The above code defines two fuzzers, or generators, capable of producing random values:
- pointFuzzer - A generator that produces random 2D points with coordinates that range from (-20, -20) to (20, 20). I intentionally constrain the range of values so that duplicate points are produced reasonably often.
- pointsFuzzer - A generator capable of producing a list of up to 1000 points. This list of points will ultimately be provided to fromList to produce QuadTrees that represent up to 1000 points.
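Two fuzzers matching those descriptions might be composed roughly as follows. This is a sketch under assumptions: the Point alias with integer coordinates is hypothetical, and capping the list with List.take is just one simple way to enforce the 1000 point limit, not necessarily how the original does it.

```elm
module PointFuzzers exposing (Point, pointFuzzer, pointsFuzzer)

import Fuzz exposing (Fuzzer)

-- Hypothetical 2D point with integer coordinates.
type alias Point =
    { x : Int, y : Int }

-- Random points with both coordinates between -20 and 20 inclusive.
-- The narrow range makes duplicate points reasonably likely.
pointFuzzer : Fuzzer Point
pointFuzzer =
    Fuzz.map2 Point (Fuzz.intRange (-20) 20) (Fuzz.intRange (-20) 20)

-- Random lists of points, truncated to at most 1000 entries.
pointsFuzzer : Fuzzer (List Point)
pointsFuzzer =
    Fuzz.list pointFuzzer
        |> Fuzz.map (List.take 1000)
```

Because both fuzzers are built from elm-test's own combinators, failing inputs can still be shrunk toward shorter lists and smaller coordinates.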
Lastly, I want to point out an insight that we can take advantage of. A QuadTree is a recursive data structure. Each quadrant within a QuadTree is itself a QuadTree. What this means is that the invariants that hold true for the top-level QuadTree should also hold true for each of that QuadTree’s four quadrants, and also for each of those quadrants’ quadrants, etc.
Translating that into plain English, we can write a recursive fuzz test that takes a QuadTree as input, makes the 8 invariant assertions listed above, and recursively calls itself with the northwest, northeast, southwest, and southeast QuadTrees. That will guarantee that all points within a QuadTree are where they should be.
Alright, finally on to the fuzz test..
This takes a little bit of digesting, but I’ll walk you through it.
pointsOrderTest is a fuzz test. The function reads “given a random list of points, create a QuadTree from those points and recursively test the invariants of that QuadTree”.
recursivePointsOrderTest is a recursive function that takes a QuadTree as input and returns a list of Expectations.
- First, this function extracts out all of the points in each of the four quadrants. Then it makes the 8 assertions that we identified earlier. For example, all northwest x coordinates should be smaller than all northeast x coordinates in this random QuadTree.
- Those 8 assertions are added to a running list of even more assertions. Those other assertions are the result of recursively calling the same function four more times, once for each quadrant of the current QuadTree.
This code will recursively walk down the QuadTree and make sure that every point is exactly where it should be in that tree.
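The overall shape of that walk can be sketched as below. To keep the sketch self-contained it defines its own stand-in QuadTree type and a simple fromList that splits at the midpoint of the bounding box; both are assumptions and will differ from the module under test, as are the helper names minPlusMax, toList and allLessThan. Only pointsOrderTest and recursivePointsOrderTest are names taken from the text.

```elm
module QuadTreeOrderTest exposing (pointsOrderTest)

import Expect exposing (Expectation)
import Fuzz exposing (Fuzzer)
import Test exposing (Test, fuzz)

type alias Point =
    { x : Int, y : Int }

-- Hypothetical stand-in for the QuadTree type; the real module may differ.
type QuadTree
    = Leaf (List Point)
    | Node Quadrants

type alias Quadrants =
    { northwest : QuadTree, northeast : QuadTree, southwest : QuadTree, southeast : QuadTree }

minPlusMax : List Int -> Int
minPlusMax values =
    Maybe.withDefault 0 (List.minimum values) + Maybe.withDefault 0 (List.maximum values)

-- Assumed build rule: split at the bounding-box midpoint until every leaf
-- holds only copies of one point. Comparing 2 * coordinate with (min + max)
-- avoids integer-division rounding.
fromList : List Point -> QuadTree
fromList points =
    case points of
        [] ->
            Leaf []
        first :: rest ->
            if List.all (\other -> other == first) rest then
                Leaf points
            else
                let
                    midX2 = minPlusMax (List.map .x points)
                    midY2 = minPlusMax (List.map .y points)
                    isWest p = 2 * p.x <= midX2
                    isNorth p = 2 * p.y <= midY2
                    build keep = fromList (List.filter keep points)
                in
                Node
                    { northwest = build (\pt -> isWest pt && isNorth pt)
                    , northeast = build (\pt -> not (isWest pt) && isNorth pt)
                    , southwest = build (\pt -> isWest pt && not (isNorth pt))
                    , southeast = build (\pt -> not (isWest pt) && not (isNorth pt))
                    }

-- Collect every point stored anywhere below a node.
toList : QuadTree -> List Point
toList tree =
    case tree of
        Leaf points ->
            points
        Node q ->
            List.concatMap toList [ q.northwest, q.northeast, q.southwest, q.southeast ]

-- Every coordinate in `smaller` must be strictly below every one in `larger`.
allLessThan : (Point -> Int) -> List Point -> List Point -> Expectation
allLessThan coord smaller larger =
    case ( List.maximum (List.map coord smaller), List.minimum (List.map coord larger) ) of
        ( Just biggestSmall, Just smallestLarge ) ->
            biggestSmall |> Expect.lessThan smallestLarge
        _ ->
            Expect.pass

-- The 8 assertions for this node, plus those of its four quadrants.
recursivePointsOrderTest : QuadTree -> List Expectation
recursivePointsOrderTest tree =
    case tree of
        Leaf _ ->
            []
        Node q ->
            let
                nw = toList q.northwest
                ne = toList q.northeast
                sw = toList q.southwest
                se = toList q.southeast
            in
            [ allLessThan .x nw (ne ++ se)
            , allLessThan .y nw (sw ++ se)
            , allLessThan .x (nw ++ sw) ne
            , allLessThan .y ne (sw ++ se)
            , allLessThan .x sw (ne ++ se)
            , allLessThan .y (nw ++ ne) sw
            , allLessThan .x (nw ++ sw) se
            , allLessThan .y (nw ++ ne) se
            ]
                ++ List.concatMap recursivePointsOrderTest
                    [ q.northwest, q.northeast, q.southwest, q.southeast ]

pointsFuzzer : Fuzzer (List Point)
pointsFuzzer =
    Fuzz.list (Fuzz.map2 Point (Fuzz.intRange (-20) 20) (Fuzz.intRange (-20) 20))

pointsOrderTest : Test
pointsOrderTest =
    fuzz pointsFuzzer "every point sits in its correct quadrant" <|
        \points ->
            -- Expect.all rejects an empty list, so seed it with Expect.pass.
            Expect.all (List.map always (Expect.pass :: recursivePointsOrderTest (fromList points))) ()
```

Returning a flat list of Expectations and combining them with Expect.all means a single fuzz run reports a failure from any depth of the tree.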
When the tests are run it produces the following output:
Ah ha! A bug was found!
What these results are telling me is that the QuadTree that results from several random lists of points has an invariant being violated. Specifically, there is a point in some northwest quadrant that has a y coordinate greater than or equal to that of another point in the corresponding southeast quadrant, and the smallest possible list to produce that failure is provided. That’s really cool!
Upon closer inspection of the shrunken lists we can conclude that two points with identical y-values are being added to the QuadTree incorrectly. But that’s not all. This bug only happens when the first of the two points has a larger x-value.
As it turns out, this bug is in the
else if line should be..
Missed an equal sign. Honestly, the data required to produce this bug is obscure enough that it probably would have fallen through the cracks until someone found it in production someday in the future.
After that line is corrected we get not only successful test results but also the confidence that this QuadTree is getting populated correctly.
Summarizing the differences with unit testing
In summary, fuzz testing is distinct from unit testing and those distinctions were peppered throughout this post, sometimes implicitly. However, they are worth highlighting because I really do think that fuzz testing is a superior testing strategy for many situations.
- When unit testing you create mock data by hand. Not only can this be extremely tedious but it’s also prone to missing edge cases. It can also be painful when refactoring your model in the future.
- When fuzz testing, you define the shape of your data and you write a test in the form “given some random input, what property or invariant should hold true?”.
- Fuzz tests use generators to produce random values. If you’ve modeled your data in a primitive hell kind of way then you’ll miss many edge cases because there are too many combinations of those primitives to cover within 100 random permutations. This actually forces you to rethink data modeling. To fuzz test effectively is to remove invalid combinations of states from your model and API, something that has benefits outside of testing.
- From personal experience, I’ve seen unit testing destroy module boundaries. A function that should be private and encapsulated suddenly is made public by a developer because that function “needs to be unit tested”. Modules have meaning. Modules encapsulate certain behavior and implementation details that sometimes shouldn’t leak out. The above QuadTree example demonstrates that even though some parts of the module are not exposed, that module can still be tested effectively.