Preparing to refactor with better Unit Tests

Published in

Perplexinomicon

5 min readSep 7, 2020

TL/DR: The last time we looked at this code, we wrote property tests. The property tests used generators to create input, but the actually testing of the invariants was just regular Elixir code. In this post, we’ll look at rewriting those property tests as standard ExUnit tests, giving us better unit tests.

It’s odd when you put your thoughts out into the world, how people see things differently. I posted two articles last month, one of which I thought had some good ideas, and one of which was a dry and painful walk though my own code. Based on statistics, people prefer dry and painful.

Fortunately for me, also based on my statistics, I don’t do this for a living and can write whatever makes me happy to write. I don’t know if I could handle writing mirror to my code stuff all the time.

Today I want to finish the refactoring of the code that I property tested in the last article. But before I do, I want better unit tests.

The code in question is the partiphify!/2 code from the morphix library. Morphix is a small Elixir library of convenience functions for Enums. partiphify!/2 takes a list l and an integer k, and divides the list as evenly as possible into k sublists. In the language of property testing,Given an input list l and an integer k,the output o of partiphify!/2 has 4 invariants:

o is a list of length k.
Every item in l is in some list in o.
The sum of the lengths of the sublists of o is equal to the length of l.
No sublist of o has a length that differs by more than 1 from any other sublist.

There are some other invariants, actually. o must be a list of lists, for example. I’m fairly certain that’s covered by the other 4, but strictly speaking it might be better to actually state it. I’m not a terribly strict person, so I’m going to stick with the invariants I’ve already got.

The code I’ve currently got in the library looks like this:

def partiphify!(list, k) when is_list(list) and is_integer(k) do
    ceil_div = fn a, b -> Float.ceil(a / b) end

    with chunk_size when chunk_size > 0 <-
           list
           |> Enum.count()
           |> Integer.floor_div(k),
         true <-
           list
           |> Enum.count()
           |> Integer.mod(k)
           |> ceil_div.(chunk_size) > 0 do
      list
      |> into_buckets(k, chunk_size)
      |> distribute_extra()
    else
      0 ->
        list = Enum.chunk_every(list, 1, 1, [])
        empty_buckets = k - Enum.count(list)
        Enum.reduce(1..empty_buckets, list, fn _, acc -> acc ++ [[]] end)

      false ->
        chunk_size =
          list
          |> Enum.count()
          |> Integer.floor_div(k)

        Enum.chunk_every(list, chunk_size, chunk_size, [])
    end
enddefp into_buckets(list, k, chunk_size) do
    chunks = Enum.chunk_every(list, chunk_size, chunk_size, [])
    extra_buckets = Enum.take(chunks, -(Enum.count(chunks) - k))
    k_buckets = chunks -- extra_buckets
    {extra_buckets, k_buckets}
enddefp distribute(list, buckets) do
    Enum.reduce(list, buckets, fn item, buckets ->
      [current_bucket | rest_of_buckets] = buckets
      new_bucket = [item | current_bucket]
      rest_of_buckets ++ [new_bucket]
    end)
end

defp distribute_extra({lists, buckets}) do
    with false <- Enum.empty?(lists) do
      [current_list | rest] = lists
      new_buckets = distribute(current_list, buckets)
      distribute_extra({rest, new_buckets})
    else
      _ -> buckets
    end
end

If that code doesn’t immediately make sense to you, that’s okay. I dissect it in a previous post. It made perfect sense when it was being written, and it’s written this way in order to avoid as many calls to ++ as possible. We’re going to refactor it to avoid any calls to ++.

But our unit tests are still in our way. Here’s an example unit test for this method:

iex> Morphix.partiphify!([:a, :b, :c, :d, :e, :f], 4)
[[:c], [:d], [:e, :a], [:f, :b]]

All the unit tests for this method are actual DocTests. Not good practice, but what’s even worse is that any change to the method that alters the order in which the output occurs will cause the test to fail, even though the order in which the output occurs is really not important at all.

DocTests are documentation. It’s in the name. They aren’t tests at all, really. So before we refactor, can we write some unit tests that actually test the method?

Let’s write some unit tests that test our invariants:

Invariant 1: given a list (l) and an integer (k), the output should be a list of length k.

We can probably write a test for that. Here is a test that is that test, but it isn’t a very good test:

test "the output list should be the same lenght as the input integer" do
    k = (1..100) |> Enum.random()
    l = (0..1000)
    |> Enum.to_list()
    actual = Morphix.partiphify!(l, k)
    assert Enum.count(actual) == k
end

We could improve the test by covering some more cases, for example, we probably want to handle the empty list and a few numbers that are larger or smaller than our value for k.

test "the output list should be the same length as the input integer" do
    k = (1..100) |> Enum.random()
    [0, 10, 13, 111, 683, 1013]
    |> Enum.each(fn length ->
      l = random_list(length)
      actual = Morphix.partiphify!(l, k)
      assert Enum.count(actual) == k
    end)
enddefp random_list(length) do
    (0..length)
    |> Enum.map(fn _ ->
      random_item()
    end)
enddefp random_item() do
    [random_string(), random_atom(), random_int()]
    |> Enum.random()
enddefp random_string() do
    [Faker.Lorem.sentence(), Faker.Lorem.word()]
enddefp random_atom() do
    Faker.Lorem.word()
    |> String.to_atom()
enddefp random_int() do
    :random.uniform(10000)
end

So we have a little randomization, and we’re working through a few different possible examples. This test passes with our current implementation, and we can make it break by changing the implementation to modify the value of k.

Invariant 2: given a list (l) and an integer (k), every item in l should be in the output.

We can leverage the work that we’ve already done to write this test.

test "all the items in the input list are in the output list" do
    k = (1..100) |> Enum.random()
    [0, 10, 13, 111, 1013]
    |> Enum.each(fn length ->
      l = random_list(length)
      actual = Morphix.partiphify!(l, k)
      assert Enum.all?(l, fn item ->
        Enum.any?(actual, fn sublist ->
          Enum.member?(sublist, item)
        end)
      end)
    end)
end

Breaking this test requires modifying our return value. I split the return list and returned the tail, which gave me a failing value. I like to see things fail.

Invariant 3: given a list (l) and an integer (k), the sum of the lengths of the output is equal to the length of l.

Again, our prior work helps us write a test here:

test "the sum of the lengths of the outputs is equal to the length of the input" do
    k = (1..100) |> Enum.random()
    [0, 10, 13, 111, 1013]
    |> Enum.each(fn length ->
      l = random_list(length)
      actual = Morphix.partiphify!(l, k)
      assert actual
            |> Enum.map(fn l -> Enum.count(l) end)
            |> Enum.sum() == Enum.count(l)
    end)
end

Invariant 4: No sublist of the output has a length that differs by more than 1 from any other sublist.

test "the lengths of the partitions are within 1 of each other" do
    k = (1..100) |> Enum.random()
    [0, 10, 13, 111, 1013]
    |> Enum.each(fn length ->
      l = random_list(length)
      actual = Morphix.partiphify!(l, k)      {min, max} = actual
                   |> Enum.map(fn l -> Enum.count(l) end)
                   |> Enum.min_max()      assert abs(max - min) <= 1
    end)
end

Are we ready to refactor yet? Finally, yes. A key point here is that, if, like me, you’re a proponent of test driven development, you might be thinking that we should have been ready to refactor as soon as we wanted to, because our tests should have already been there to allow us to refactor with wild abandon.

Exactly.

Preparing to refactor with better Unit Tests

Written by J Paul Daigle