Preparing to Refactor with Property Tests

Published in

Perplexinomicon

9 min readAug 16, 2020

TL/DR

If you can think in properties, with or without a property testing library, your tests will support refactoring rather than being a barrier to it.

I didn’t figure out what this post is about until I finished it, but here it is: This post is about why and how Unit Testing can be broken in your application. It’s about the difference between good DocTests and good Unit Tests. And ultimately, it’s about thinking in properties. We take a while to get there, and there are some detours, but ultimately, that’s where we’ll land.

So a bit ago I wrote about the partiphify method, a part of the morphix library for Elixir. My conclusion was that I wasn’t particularly happy about the code I originally wrote for that method. It seemed clumsy. So I wanted to refactor it.

But that led to another problem, which is example code, and unit testing. What the partiphify! method does is to take a list and divide it up into a fixed number of sublists of equal length, or, as equal length as possible. Given a list like [1, 2,3,4,5,6,7] and the integer 3, partiphify! should return three lists, which might be [[1,2,3], [4,5], [6,7]], but not [[1,2,3],[4,5,6],[7]]. The second partitioning doesn’t keep the lists as close in size as possible, but the first partitioning does.

The thing about writing example code and unit tests for this kind of a method is that implementation matters. Here’s an example from the DocTests:

iex> Morphix.partiphify!([:a, :b, :c, :d, :e], 4)
[[:b], [:c], [:d], [:e, :a]]

Any implementation that ordered the list differently (for example [[:a,:b], [:c],[:d],[e]], would fail that unit test, even though the implementation might be correct in all the ways that matter. Which is okay in a doctest, which is more of an example than a test, but not great in a unit test.

Which made this a great time to add property tests to the library. A property test is a test where the developer determines things that are true about the code (these are called invariants) and example inputs are generated by the testing code and tested against these invariants.

So the first task involved in writing a property test is to figure out what your invariants are. In the case of partiphify!, we know that the function should take two arguments, a list and an integer, and it should divide the list into k sublists that are as close in size to each other possible.

Thinking about it a bit, we (hat tip to Jessica Kerr) came up with a list of 4 properties that make up the partiphify! function, given that there is an input list l, an input integer k, and an output value out.

The length of out should be equal to k.
Every item in l should be in out.
The sum of the lengths of the items in out should equal the length of l.
The length of each list in out should be at most +-1 of any other list in out.

And that’s it. If those properties match, then we know that we have divided our input list into partitions correctly. All the items are in the output, no item is in more than one sublist, we have the right number of sublists, and the sublists are balanced.

I wrote property tests, initially, using PropCheck, and then later rewrote those same tests using StreamData. PropCheck is a property testing library for Elixir that provides a wrapper around PropER, a property based testing library for Erlang.

To set up either library, you first want to add the library to your deps(), then add the application to your extra_applications: [] or applications: [] Keyword lists. I wanted to make sure these only ran in test, so for my application the changes to my mix.exs looked like this:

def application do
  [applications: applications(Mix.env())]
enddefp applications(:test) do
  applications(:default) ++ [:stream_data, :propcheck]
enddefp applications(_) do
  [:logger]
enddefp deps do
  [
    {:excoveralls, "~> 0.8", only: [:dev, :test]},
    {:ex_doc, "~> 0.18", only: :dev},
    {:credo, "~> 0.9.1", only: [:dev, :test]},
    {:stream_data, "~> 0.5.0", only: [:dev, :test]},
    {:propcheck, "~> 1.1", only: [:dev, :test]}
  ]
end

I have the changes for adding both StreamData and PropCheck here, but really you only need one of them.

In your test itself, you’ll want to add the dependencies:

defmodule PartiphifyPropTest do
  use ExUnit.Case
  use PropCheck

Or:

defmodule PartiphificationTest do
  use ExUnit.Case
  use ExUnitProperties

So for our first property, this is the test I came up in PropCheck:

# number of partitions should be correct
property "number of partitions is correct" do
  forall {list, p} <- {list(), integer(1, 10)} do
    partitioned = Morphix.partiphify!(list, p)
    Enum.count(partitioned) == p
  end
end

The property block acts like a test statement. The first line is the forall statement, which names two variables, list and p, and assigns generators to those variables. The first generator is list(), which will produce lists of any valid term, and integer(1,10), which will produce integers in the range of 1 to 10. The general form of this statement is

forall {var_1, var_2, …, var_n} <- {generator_1, generator_2, ..., generator_n}
....
end

Inside the forall statement, we want to have code that results in either a true or a false value. In this case, once we’ve partitioned the input list, we want to see that the number of partitions is equal to the input integer.

In StreamData, the test looks like this:

property "number of partitions is correct" do
  check all list <- list_of(term(), min_length: 0, max_length: 100),
            part <- integer(1..27) do
    assert list
           |> Morphix.partiphify!(part)
           |> Enum.count()
           |> Kernel.==(part)
  end
end

So, again, we have a property statement, and inside that statement we have a check all statement instead of a forall statement. The check all statement adds variables one at a time rather than in tuples. So in this case, the first variable will be list, which is generated by list_of(term(), [options]), so will be lists of between 0 and 100 of type term() , and part, which is generated by integer(1..27). The general form of this statement is:

 check all var_1 <- generator_1,
           var_2 <- generator_2,
           ...,
           var_n <- generator_n do
    ...
end

Inside the check all statement, we use assert to test the property. The code here looks a little different because I’m using a pipechain, but all the same code is being called in the same order as in the PropCheck example.

Our second invariant is that every item that is in the original list is in the partitions, somewhere. For this property, the generators are the same, but the code that checks the invariant is fairly complex and might be worth looking at:

assert Enum.reduce(list, true, fn list_item, acc_a ->
  in_a_list = 
    Enum.reduce(partitioned, false, fn partition, acc_b ->
      Enum.member?(partition, list_item) || acc_b
  end)  in_a_list && acc_a
end)

Warning: In this next section I go pretty deep in the reduction logic for this invariant. I’m doing this mostly because using Enum.reduce/3 is not always intuitive for me, and I found it initially hard to wrap my head around it for uses other than adding up numbers. If you find the invariant code blindingly obvious, feel free to skip the next few paragraphs.

Let’s look at this code in detail.

Starting with the outside reduction, we’re reducing on true, and we can simplify a bit… let’s say I wanted to establish that every integer in a list of integers was greater than 10. I could write something like this:

small_integers = Enum.filter(list, fn i -> i <= 10 end)
less_than_tens = Enum.empty?(small_integers)

and less_than_tens would be either true or false. Or I could write this:

less_than_tens = 
  Enum.reduce(list, true, fn i, acc -> i <= 10 && acc end)

Where once some integer has flipped the accumulator from true to false, it will remain false for the rest of the list.

The outside reduction does just that, we look at each item in the input list, and check to see whether it is in a partition. If it is not, the value in_a_list && acc_a will become false, and it stays false for the remainder of the list.

What about finding the value of in_a_list? In this case, I have a list of lists, and I want to see if some value is in one of those lists. There are a lot of ways to do this as well. For example, I could write:

(i is some integer we're checking)
lists_of_bools =
  Enum.map(list_of_lists, fn list -> Enum.member?(list, i))
in_a_list = Enum.member?(list_of_bools, true)

If true is a member of our list of bools, than we found i in some list.

Alternately, I can use a reduction:

in_a_list = 
  Enum.reduce(list_of_lists, false, fn list, acc -> 
    Enum.member?(list, i) || acc
  end)

In this case, because of the || statement, once the item is found in a list, the accumulator becomes true, and it will remain true even if the item is not found in any subsequent list.

End of Warning Section.

So our outer reduction looks at each item in the input list, and if every item is found in any partition (by the inner reduction) it will return true as it’s accumulated value. If any item is not found, we return false and the invariant will fail. Notice, this invariant would pass under a number of bad implementations. For example, if partiphify had the following implementation:

def partiphify!(list, k), do: [[list]]

Then this invariant would pass. Similarly, if the implementation were:

def partiphify!(list, k) do: (1..k) |> Enum.map(fn _i -> [] end)

Then the first invariant would pass.

def partiphify!(list, k) do
  emptys = (2..k) |> Enum.map(fn _i -> [] end)
  [list] ++ emptys
end

Would pass both of these property tests, and the third invariant as well. The only invariant it would fail would be the fourth one, because the differences in sizes between partitions would exceed 1. I could pass that invariant and the 1st and 2nd invariants with this code:

def partiphify!(list, k), do: (1..k) |> Enum.map(fn _i -> [list] end)

Now all the partitions are the same size, and I have the right number of partitions, and every item in the input list is in some partition. But I’ll fail the third invariant, because I’ll have k * Enum.count(list) items if I add up the count of the partitions, and I only want Enum.count(list) items.

Each invariant limits the implementation in different ways, and each of them limits the implementation in ways that are relevant to the function. Contrast this to our DocTest:

iex> Morphix.partiphify!([:a, :b, :c, :d, :e], 4)
[[:b], [:c], [:d], [:e, :a]]

The DocTest gives an output example that meets all four invariants, but there are a lot of irrelevant factors in it as well. It tests ordering, and grouping, and the ordering inside groupings. We don’t care about any of those things, and there are many correct implementations of the method that would fail this test. Which means that rather than being a help to refactoring, this test stands in the way of refactoring!

An interesting question is whether we could write better unit tests around this code that would help us refactor. And of course, we could test these properties without using a property test library. We could even add some randomness to the test.

test "the number of partitions should be correct" do
  list = [1,2,3,4,5,6,7,8,9,10]
  k = (1..27) |> Enum.random()
  parts = Morphix.partiphify!(list, k)
  assert Enum.count(parts) == k
end

With a little work, I could (and might) generate random lists as well. I could even change the implementation to go through every number from 1 to 27 against every one of a number of randomly generated lists, or hit some set of numbers I though was import (1, the list count / k, list count-1, stuff like that). Once I built those cases for number of partitions, it would be a simple matter to test the other invariants.

These would be better Unit Tests. They might not make great DocTests, but they would be much better unit tests than testing that the input exactly matches the output. A good DocTest should be a simple example that tells people what will happen if they use the function being documented. In Elixir, it’s there to clarify and support your documentation. It’s not there to test your code.

A good Unit Test is there to test your code. So the goals of the two tests are not the same. Something I’ve often heard and have sometimes said is that “If your tests stand in the way of refactoring, they probably aren’t very good.” You’ve probably also heard “Don’t test the implementation” when writing unit tests. But what does the second statement mean, and how do we avoid having tests stand in the way of implementation?

I think the answer is this: “Try to think in properties, not examples”. That doesn’t mean adding a property testing library to your code. It means understanding what your invariants are for a piece of code, and attempting to test those invariants. It means not testing things that are not invariants as well!

.If you can think in properties, with or without a property testing library, your tests will support refactoring rather than being a barrier to it.

Preparing to Refactor with Property Tests

TL/DR

Written by J Paul Daigle