What is randomized testing?
Randomized testing (a subset of property-based testing) is a concept popularized by a library called QuickCheck. If you haven’t heard of it, I invite you to take a moment to familiarize yourself. Originally from Haskell and ported to many different languages, QuickCheck uses function input type annotations to generate randomized arguments. The developer defines a property that should hold for all inputs of a given type. QuickCheck will make hundreds of passes against the property in a single test run to check that it holds for every generated case.
You might say, “Well, I use Math.random to generate random inputs for my tests. Look at all this random,” which is missing the point because the “random” in randomized testing is a little more deliberate than that. Even if I know the type of input a function expects, there are different kinds of ways that I can test for that type. These are usually referred to as edge cases. You will not see much value out of generating random numbers if you don’t try different kinds of numbers: 1, 0, -1, 2, 3, 500.
As an example, let’s say you have a function which takes an array of strings as its only argument. QuickCheck would attempt to generate test input for a wide range of cases. Not only that, it will go out of its way to try many edge cases. The first thing it would try is an empty array. And then an array with one string. And then an array with one empty string. And then an array with two strings. And then… It deliberately tries to break your test hundreds of times in a single test run.
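To make the "small inputs first" idea concrete, here is a minimal sketch in plain JavaScript. The helpers randomInt, genString and genStringArray are made up for this illustration and are not QuickCheck's or TestCheck's API; the only idea borrowed is a size parameter that grows with each test so that early inputs are tiny.

```javascript
// Hypothetical helpers for illustration only; not a real library API.
const randomInt = (max) => Math.floor(Math.random() * (max + 1));

// A string of length 0..size drawn from a small alphabet.
const genString = (size) => {
  const alphabet = 'abcxyz019 ';
  let out = '';
  const length = randomInt(size);
  for (let i = 0; i < length; i++) {
    out += alphabet[randomInt(alphabet.length - 1)];
  }
  return out;
};

// An array of 0..size strings, each of length 0..size.
const genStringArray = (size) =>
  Array.from({ length: randomInt(size) }, () => genString(size));

// Size starts at 0, so the first input is always [] and early inputs stay small.
const inputs = Array.from({ length: 100 }, (_, size) => genStringArray(size));
```

A real library is smarter about which edge cases it tries, but the growing size is the core of why the empty array and the empty string show up so early.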
A traditional approach to unit testing can only be counted on to verify that specific inputs result in specific outputs. It does not make any guarantees about all inputs and certainly does not say much about the property which the test is attempting to verify.
More impressive, however, QuickCheck also implements a feature called “shrinking”. If it generates inputs that cause the test to fail, it will then attempt to reduce the input to a minimum failing case so that you can more easily fix your code. Yes, that’s right. Even after a failure it will continue to run tests to see whether it can produce a smaller, simpler failing input.
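Shrinking can be sketched in a few lines. The version below is a naive greedy shrinker for arrays of integers, written for this post's purposes; shrinkCandidates, shrink and the deliberately false property noneAboveNine are hypothetical names, and real libraries use more careful strategies.

```javascript
// Simpler variants of an array: drop one element, or halve one element toward 0.
const shrinkCandidates = (xs) => {
  const candidates = [];
  for (let i = 0; i < xs.length; i++) {
    candidates.push([...xs.slice(0, i), ...xs.slice(i + 1)]);
  }
  for (let i = 0; i < xs.length; i++) {
    if (xs[i] !== 0) {
      const smaller = Math.trunc(xs[i] / 2);
      candidates.push([...xs.slice(0, i), smaller, ...xs.slice(i + 1)]);
    }
  }
  return candidates;
};

// Greedily move to the first simpler input that still fails, until none does.
const shrink = (failing, property) => {
  let current = failing;
  let next;
  while ((next = shrinkCandidates(current).find((c) => !property(c)))) {
    current = next;
  }
  return current;
};

// Deliberately false property: "no array contains a number greater than 9".
const noneAboveNine = (xs) => xs.every((x) => x <= 9);

// Starting from a noisy failing input, the shrinker ends at [10].
const smallest = shrink([3, 250, -7, 41], noneAboveNine);
```

The noisy four-element input is reduced to the single value sitting right on the boundary, which is far easier to reason about when fixing the bug.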
If you’re interested in learning more about randomized and property-based testing, this talk by Jessica Kerr is a good place to start.
As this fails when either b is a negative integer, testcheck.check might return something like the following.
It’s a really awesome tool. There is one problem with TestCheck though: the developer needs to manually create and maintain input generators. Given a complex enough domain model, this is a huge pain and prone to error. Here’s what a “Person” generator on an e-commerce application might look like.
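To give a feel for the maintenance burden, here is a minimal hand-written sketch in plain JavaScript. Everything in it is an assumption made for illustration: the Person fields (id, name, email, addresses, cart), the nested shapes, and the helper functions, which stand in for TestCheck generators rather than reproduce its API.

```javascript
// Hypothetical helpers standing in for a generator library.
const randomInt = (min, max) => min + Math.floor(Math.random() * (max - min + 1));
const pick = (options) => options[randomInt(0, options.length - 1)];
const genString = () => Math.random().toString(36).slice(2, 2 + randomInt(0, 8));
const genArray = (genItem, maxLength) =>
  Array.from({ length: randomInt(0, maxLength) }, () => genItem());

// Every nested shape needs its own generator...
const genAddress = () => ({
  street: genString(),
  city: genString(),
  country: pick(['CA', 'US', 'DE']),
});

const genLineItem = () => ({
  sku: genString(),
  quantity: randomInt(1, 5),
  priceInCents: randomInt(0, 100000),
});

// ...and every change to the Person type has to be mirrored here by hand.
const genPerson = () => ({
  id: randomInt(1, 1000000),
  name: genString(),
  email: `${genString()}@example.com`,
  addresses: genArray(genAddress, 3),
  cart: genArray(genLineItem, 5),
});

const person = genPerson();
```

Nothing ties genPerson to the Person type annotation, so the two can silently drift apart.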
One of my favorite things about statically-typed programming languages is that I can stop caring about the input and output types of my functions: static analysis has got my back. Compilation will fail if I’ve made a mistake. My test code is cleaner too. I can devote everything to testing behavior and not type checks.
Most of that “kind” of problem tends to fade away with comprehensive type checking à la Flow. With end-to-end type coverage, you can be reasonably certain that anything you would unit test is always going to return the correct type of output as long as it receives the correct type of input. In this way we can do away with checking for types or presence. Runtime schema checks at the boundaries of the program are good enough to further lock down those guarantees.
Of course, Flow is only a pre-runtime check and the developer should still write tests. However, given sophisticated enough input, unit testing can be ineffective for testing against edge cases. Instead developers will usually add a few regression tests to verify that their code won’t totally break on modification and leave it at that.
And now the snake oil
As a team we had been enjoying the benefits of very strict type coverage with Flow. We were also very confident in our randomized testing suite. However, we felt a significant amount of pain trying to keep our TestCheck generators up to date with type annotations. We began to question whether we might just create generators from the types themselves. And so our child was born.
flow-to-gen for Babel
babel-plugin-transform-flow-to-gen is a Babel transform that turns your Flow type annotations into TestCheck generators. After all, Flow can only conclude that your input types are correct. It cannot conclude that the program will run correctly.
We begin our treatise with an overview of how to integrate flow-to-gen with a test suite, then explain how it works under the hood, and finally cover some of its other advantages.
It’s best to make sure that Babel only uses the transform in test mode. Set up your config to look something like this.
npm i babel-plugin-transform-flow-to-gen --save-dev
# or
yarn add babel-plugin-transform-flow-to-gen --dev
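A test-only setup might look roughly like the sketch below. It assumes a Babel version that reads a JavaScript config file (for example Babel 7's babel.config.js); with a .babelrc the same keys would be written as JSON. The env option is standard Babel and is selected by BABEL_ENV or NODE_ENV, which Jest sets to test by default. Any presets or other plugins your project already uses are left out.

```javascript
// babel.config.js (assumed file name; adapt to your own Babel setup)
module.exports = {
  env: {
    // Only applied when BABEL_ENV or NODE_ENV is "test".
    test: {
      plugins: ['babel-plugin-transform-flow-to-gen'],
    },
  },
};
```

Scoping the plugin to the test environment keeps the generated code out of production builds.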
Integrating with a test suite
In order to make use of proper shrinking in your tests, flow-to-gen should be used in conjunction with TestCheck and all of its supporting libraries. In the examples below we use jasmine-check to integrate with Jest. (Jest works with jasmine-check because it is written on top of Jasmine.)
jasmine-check enhances Jest by adding a check.it function that is similar to it but also accepts a list of generators. check.it uses the generators to execute your test callback with randomized values. The following example is similar to what we’ve seen before, but now running through the test suite.
Jest will run the callback for hundreds of different values. And once b is randomized as a negative one, the test suite will blow up.
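Conceptually, a runner of this kind is a loop that feeds generated arguments to your callback and stops at the first failure. The sketch below is a hypothetical stand-in written in plain JavaScript, not jasmine-check's implementation: runProperty, genInt and the function under test, repeatNumber, are all assumptions for illustration, and there is no shrinking.

```javascript
// Hypothetical generator: a random integer in [-size, size].
const genInt = (size) => Math.floor(Math.random() * (2 * size + 1)) - size;

// Hypothetical runner: call the test callback with fresh random arguments each time.
const runProperty = (generators, callback, numTests = 100) => {
  for (let i = 0; i < numTests; i++) {
    const args = generators.map((generate) => generate(i));
    try {
      callback(...args);
    } catch (error) {
      return { passed: false, numTests: i + 1, failingArgs: args, error };
    }
  }
  return { passed: true, numTests };
};

// Hypothetical function under test: String.prototype.repeat throws on a negative count.
const repeatNumber = (a, b) => String(a).repeat(b);

const result = runProperty([genInt, genInt], (a, b) => {
  const repeated = repeatNumber(a, b);
  if (repeated.length !== String(a).length * b) {
    throw new Error('unexpected length');
  }
});
```

With a hundred runs, a negative b turns up almost immediately, and result.failingArgs records the arguments that triggered the failure.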
Each transformed function gets an asGenerator static member that generates tuples of randomized test arguments for the function.
In the example below, we test a function setName by verifying that for any random string, we will copy the person and give her a new randomized name. (I always wanted to name my daughter “x5jdA”.) The setName.asGenerator function generates tuples of [Person, string]. A single run of the test suite will verify that this property holds true for one hundred random cases.
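The property itself is small enough to spell out without any library. In this sketch the Person shape (name and age), the setName implementation and the genSetNameArgs helper are assumptions made for illustration; genSetNameArgs plays the role described for setName.asGenerator by producing [Person, string] tuples.

```javascript
// Hypothetical generators for the two argument types.
const genString = () => Math.random().toString(36).slice(2, 8);
const genPerson = () => ({ name: genString(), age: Math.floor(Math.random() * 100) });

// Stand-in for the generated argument tuples: [Person, string].
const genSetNameArgs = () => [genPerson(), genString()];

// Assumed implementation under test: copy the person and replace the name.
const setName = (person, name) => ({ ...person, name });

// The property: a new object comes back, it carries the new name,
// other fields are preserved, and the original is not mutated.
const holdsFor = ([person, name]) => {
  const before = { ...person };
  const renamed = setName(person, name);
  return (
    renamed !== person &&
    renamed.name === name &&
    renamed.age === person.age &&
    person.name === before.name
  );
};

// One hundred random cases, as in a single test run.
const allPassed = Array.from({ length: 100 }, () => genSetNameArgs()).every(holdsFor);
```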
Under the hood
As previously mentioned, the flow-to-gen runtime is mostly a wrapper around TestCheck. Therefore, all of the generated code can be used along with TestCheck or
Using type aliases
The beauty of flow-to-gen is that if you already have types, you do not need to modify any of your existing code to get going right away. You can treat your type aliases as if they compiled to functions. Generating sample information is easily done with the sampleOne helpers which can be imported from
The flow-to-gen API also bundles with type helper functions (which are mostly wrappers around TestCheck generators) for creating your own runtime generators. There are a bunch of them which I’m not going to enumerate here, but the three we see in the example below are types.plainObject for plain objects, types.string for strings, and types.number for integers.
Given all of this new information, I could have written the previous test as the following.
You may be wondering why Person needs to be a function. Why can’t Person be the generator itself? The reason is generics (related to parameterized types) which we will discuss next.
For our purposes, a generic can be thought of as an argument to a type definition. They can be used to reduce the amount of duplication in type annotations. Types with generics are an abstraction over an annotation.
Generators, both ones created by flow-to-gen and provided by the API, are composable. In the example below, dogGen produce equivalent results. Because Animal has a generic value, we have to provide it with a generator as an argument.
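The reason type aliases behave like functions becomes clearer with a toy model. In the sketch below a "generator" is simply a zero-argument function returning a random value, which is an assumption for illustration and not how TestCheck represents generators. The Animal, DogTraits and Dog shapes are likewise invented.

```javascript
// Toy generators: zero-argument functions that return a random value.
const genString = () => Math.random().toString(36).slice(2, 8);
const genBoolean = () => Math.random() < 0.5;

// type Animal<T> = { name: string, traits: T }
// A generic alias becomes a function of one generator per type parameter.
const Animal = (T) => () => ({ name: genString(), traits: T() });

// type DogTraits = { barks: boolean }
// A non-generic alias is still a function; it just takes no arguments.
const DogTraits = () => () => ({ barks: genBoolean() });

// type Dog = Animal<DogTraits>
const Dog = () => Animal(DogTraits());

// Composing by hand gives an equivalent generator.
const dogGen = Animal(DogTraits());

// Any generator works as the argument, including one written by hand.
const loudAnimalGen = Animal(() => ({ barks: true }));

const dogA = Dog()();
const dogB = dogGen();
const loud = loudAnimalGen();
```

If Person were a generator rather than a function, there would be nowhere to pass T, so every alias gets the same calling convention whether or not it is generic.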
It’s also perfectly fine to use your own generators as arguments to generators.
The same principle also applies to function annotations that use generics.
Generating valid data
Sometimes it’s not enough to just generate a type alias with any random inputs. For example, consider that you might have a User type whose id is strictly in the form of a v4 UUID and perhaps your program makes a lot of assumptions about the shape of said UUID. The type string is simply not enough. In such a case, you can forcefully override randomized input with your own. The $Gen type allows you to do this. It’s a cute little trick to pass along information to the transform without breaking Flow.
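As an illustration of the kind of custom generator you might substitute for a plain string, here is a self-contained sketch that produces strings in v4 UUID form. The names hex, genUuidV4 and genUser are hypothetical, and the sketch deliberately leaves out the $Gen annotation itself; it only shows the generator side.

```javascript
// Random lowercase hex string of the given length.
const hex = (length) =>
  Array.from({ length }, () => Math.floor(Math.random() * 16).toString(16)).join('');

// v4 layout: the version nibble is fixed to 4 and the variant nibble
// is one of 8, 9, a or b (RFC 4122).
const genUuidV4 = () => {
  const variant = '89ab'[Math.floor(Math.random() * 4)];
  return `${hex(8)}-${hex(4)}-4${hex(3)}-${variant}${hex(3)}-${hex(12)}`;
};

const UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;

// Hypothetical User generator: id is constrained, name is any short string.
const genUser = () => ({ id: genUuidV4(), name: hex(6) });

const users = Array.from({ length: 100 }, () => genUser());
const allValid = users.every((user) => UUID_V4.test(user.id));
```

Math.random is fine for test data like this, though it is not suitable for generating identifiers in production.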
Locking down your type definitions
An additional and originally unforeseen benefit of adding flow-to-gen to our test suite has been discovering that sometimes the problem is with the type aliases (not poorly written production code or bad tests). Our types allowed for objects that we would never expect to see on a “valid” record. We only figured this out after inspecting some of the data generated from our type aliases!
For example, consider a scenario where a function input type is an array of strings (string[]), but it is common knowledge amongst the team that this array would always be exactly three elements long. Under certain conditions (which TestCheck will test for) our function may blow up. Instead we should define the input as a triple of strings ([string, string, string]). It’s easy to overlook such subtleties when the entire team “just knows” that something like this should be true. The change moves that knowledge to the codebase without the need for documentation.
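A small runnable sketch shows the difference the narrower type makes to generated data. The function formatPhone and the generators are hypothetical; genStringArray plays the part of data generated from an array-of-strings annotation and genTriple the part of data generated from the triple.

```javascript
// Hypothetical function that "just knows" its input has three parts.
const formatPhone = (parts) => {
  const [area, exchange, line] = parts;
  return `(${area.trim()}) ${exchange.trim()}-${line.trim()}`;
};

const genString = () => Math.random().toString(36).slice(2, 6);

// Arrays of strings can have any length, including 0, 1 and 2.
const genStringArray = () =>
  Array.from({ length: Math.floor(Math.random() * 6) }, () => genString());

// A triple always has exactly three elements.
const genTriple = () => [genString(), genString(), genString()];

const survives = (input) => {
  try {
    formatPhone(input);
    return true;
  } catch (error) {
    return false;
  }
};

const arrayFailures = Array.from({ length: 200 }, () => genStringArray())
  .filter((input) => !survives(input));
const tripleFailures = Array.from({ length: 200 }, () => genTriple())
  .filter((input) => !survives(input));
```

Every entry in arrayFailures is an array shorter than three elements, while tripleFailures stays empty: the looser type is what lets the breaking inputs through.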
Unit tests break with changes to type aliases
Your tests are generated from type aliases, so they may break when type aliases change. If you were expecting a particular property to hold, but it doesn’t, it could be a sign that your type definitions are not exactly as you intended them to be.
UI Fuzz Testing
If React component props are written in Flow, we can use them to render components with a higher degree of randomness and ensure that our UI doesn’t blow up on edge cases.
Generating garbage for
In order to encourage you not to use these loose types, flow-to-gen generates nonsense garbage whenever these types are used. If your function truly accepts any then I’m happy for you. Enjoy your garbage. That said, you were probably just being lazy.
Automatic mocks with Jest
If you are running Jest as your test suite, flow-to-gen will automatically generate Jest mock functions in place of function types.
Using randomized testing works amazingly well for code that simply cannot have bugs. For us, this approach has found an unbelievable number of flaws in mission critical code. I would never consider going back to the old way of doing things.
But the feedback loop is very slow. For example, we have one very important function that tries at least 600 different sets of inputs on a single test run. With Jest, we can split the work out among processes, but it still takes about 12 seconds on my beast of a MacBook Pro. And there aren’t even network calls or database reads/writes.
This sucks if you’re using TDD. Fortunately enough, however, we don’t have to go all in.
jasmine-check only creates the check.it function, but leaves the regular it function unchanged. In this way, we’re allowed to decide which parts of our code need a more robust form of inspection.
Where to go from here
We’re using flow-to-gen to transform our type aliases into test generators, but there’s still a lot of work to be done. Take ES6 classes for example. We have no support for that. Flow global types. We have no support for those. React components. Not really. Recursive types. Those blow up. The list goes on. These are problems that we have not yet explored, but we would love to hear your ideas on how we can solve them.
Check it out at https://github.com/unbounce/babel-plugin-transform-flow-to-gen.
Thank you to Ray Huang, Roman Gonzalez, Emily Mears, Niall Lennon and Tavis Rudd for taking time to provide feedback on this blog post.