The (not so) Magic Tricks of Testing in Elixir (1/2)
As shown by André Albuquerque’s post, Onfido is betting big on Elixir. It has been a great journey, shifting our mindset from Object-Oriented to Functional Programming. It’s incredible to be able to leverage the decades of engineering work that have gone into the BEAM VM, and it’s even better to top that up with the productivity tools that Elixir gives us.
However, one of the main pain points we’ve felt when making this transition to Elixir is related to testing. Tests are an integral part of any application. They not only serve as great living documentation, but are also our safety net when we want to refactor our applications. Thus, it is paramount to be able to test Elixir applications properly. In this post I’ll describe some of the obstacles we’ve found in this odyssey, how we’ve overcome them, and how we currently test Elixir applications at Onfido.
Disclaimer: What I’ll be talking about only applies to ExUnit. You can use other test frameworks (such as espec) and you probably won’t feel the pains that I’ll be describing here. While I’m not directly advocating for ExUnit, I think it’s extremely valuable to stop and think about some design decisions behind it — especially when you’ve been using testing frameworks that are very different.
DAMP not DRY!
This is a very famous phrase in the testing community. To follow this principle, instead of trying to remove the duplication in the tests (and follow the Don’t Repeat Yourself principle), one should rather have Descriptive And Meaningful Phrases on the tests. The importance of this is twofold:
- Tests are a great source of living documentation. They are a great way of conveying the author’s intent, and also make much clearer what’s the purpose of a certain module or function. Thus, test code should maximize its readability;
- Test code is untested code. As such, one should minimize the logic in it and try to keep it as simple as possible.
Let’s see this principle in action with a code sample:
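The original gist is not embedded in this version of the post, but a minimal sketch of the shape being discussed might look like this (the module, function and helper names are illustrative):

```elixir
defmodule Demo.AddressValidatorTest do
  use ExUnit.Case, async: true

  # Illustrative helper shared by both describe blocks below.
  defp address_already_validated do
    %{address: "221B Baker Street", validated?: true}
  end

  describe "validate/1 with a previously validated address" do
    setup do
      {:ok, state: address_already_validated()}
    end

    test "returns true", %{state: state} do
      assert state.validated?
    end
  end

  describe "revalidate/1 with a previously validated address" do
    setup do
      {:ok, state: address_already_validated()}
    end

    test "keeps the address marked as validated", %{state: state} do
      assert state.validated?
    end
  end
end
```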
Here we have two `describe` blocks which share a similar set-up phase: they both call the `address_already_validated` function. When looking at this sample, most developers will have the urge to remove this duplication and put an outer `describe` context with the `address_already_validated` set-up block.
While this may seem harmless in this simple example, it becomes a huge problem as your application gets bigger and bigger. The goal here is that you should be able to look at a single test and instantly realise what’s going on, without having to jump around just to understand what the test is doing (if you come from Ruby like me, I’m pretty sure you’ve felt the pain of jumping into a project, looking at the specs, and having a brain stack overflow just trying to figure out all the contexts and shared contexts relevant to a single test).
This descriptiveness is actually enforced by ExUnit, since you can’t create nested `describe` blocks (José Valim explains the rationale behind this decision here). While it looks like an insignificant design decision, it’s a concrete example of Elixir striving for the long-term maintainability of a project (and thus the long-term productivity of its developers).
Detroit-school TDD vs. London-school TDD
I believe that most of the difficulties that I’ve felt can be explained by the differences in philosophy regarding testing.
Detroit-school TDD is the classical one, created by Kent Beck and others in the 90s. This type of TDD tries to maximize the regression safety net introduced by tests, and does that by minimizing the use of test doubles. However, this will inevitably lead to redundant coverage (and all the problems that come with it). Also, the design feedback is weaker when practicing this type of TDD.
In London-school TDD, we isolate all the dependencies and focus only on the subject under test. For this reason, followers of this school usually think of these as the TRUE unit tests, since you’re only exercising the code that’s under test. This type of TDD usually yields higher design feedback, but these tests have to be complemented with integration tests to ensure everything is glued together as it should be.
When talking about Functional Programming languages, it’s more usual to see practitioners of the Detroit-school TDD, since we strive for pure functions whenever possible (and thus minimize side effects), and it’s common in a unit test of a certain module to just let functions from other modules run freely.
Since I’m a practitioner of the London-school, I want to create mocks even when the module/function being called doesn’t have any side effects. This was the biggest pain I’ve felt, since I wanted to have mocks but also run my tests concurrently. Throughout the rest of the post I’ll describe the strategies I use to test Elixir applications.
The magic tricks of Testing
In her talk in 2013, Sandi Metz comes up with a matrix that describes how to create tests that aren’t redundant: assert on the result of Incoming Queries, assert on the direct public side effects of Incoming Commands, expect Outgoing Commands to be sent, and ignore everything else (if you haven’t watched this talk I highly recommend it!).
The Incoming Query and Command do not require the creation of mocks, but I’ll quickly show an example of each one. Then, the Outgoing Command is where things get a bit trickier, and that’s where we’ll need to create mocks.
Let’s say that we are creating tests for this module:
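The gist with the module is missing from this version of the post; based on the description that follows, a rough reconstruction could look like this (the module name, the event shape, and the `emit/1` function of the emitter are assumptions):

```elixir
defmodule Demo.AddressValidator do
  # Sketch of the module described below; names are illustrative.

  @max_length 32

  # Incoming Query: returns a value, and emits an event as a side effect.
  def validate(address) when is_binary(address) do
    Demo.Events.EventEmitter.emit({:address_validated, address})
    String.length(address) < @max_length
  end

  # Incoming Command: writes to the application configuration.
  def set_validation_level(level) do
    Application.put_env(:demo, :validation_level, level)
  end

  # Reads the value written by the command above.
  def validation_level do
    Application.get_env(:demo, :validation_level)
  end
end
```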
This is an address validation module, which just validates that a certain address has fewer than 32 characters. Also, let’s say that the `validate` function emits an event (our side effect). This module also reads and writes `:validation_level` in the application configuration. This value will not actually be used; it’s just here to serve as an Incoming Command, but it could describe the level of validation we would apply (e.g. a `:thorough` validation would do more than validate the address length).
Incoming Query
As we can see in the matrix depicted above, for an Incoming Query we just want to assert on the result of the function that’s being tested. This means that we simply run the function and set the assertion according to the provided arguments. This one was really easy!
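A test for this Incoming Query could be as simple as the following sketch (assuming the `Demo.AddressValidator.validate/1` function described above):

```elixir
defmodule Demo.AddressValidatorQueryTest do
  use ExUnit.Case, async: true

  describe "validate/1" do
    test "returns true for an address shorter than 32 characters" do
      assert Demo.AddressValidator.validate("221B Baker Street")
    end

    test "returns false for an address with 32 or more characters" do
      refute Demo.AddressValidator.validate(String.duplicate("a", 32))
    end
  end
end
```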
Incoming Command
In this case we want to test the direct public side effects of running the command.
Now we’re not testing the `validate` function but the `set_validation_level` function. We run the function under test, and then observe the public side effects of running this command by calling the `validation_level` function. This is what we’re interested in for this type of test. Note that, since the message is incoming, you should NOT expect `Application` to receive `put_env` with the right arguments, as this would leak implementation details into the test. This one was fairly easy as well! Moving on to the last one.
Outgoing Command
We now want to make sure that the command gets called, with the right arguments. However, before touching the test code, we need to change our production code! We need to inject the dependency (`Demo.Events.EventEmitter` in this case). We do that by defining a new module attribute and then using it in the function that emits the event:
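A sketch of that dependency injection (the config key `:event_emitter` is an assumption; on recent Elixir versions, `Application.compile_env/3` is the recommended way to read configuration at compile time):

```elixir
defmodule Demo.AddressValidator do
  # Read the emitter module from config at compile time,
  # falling back to the real implementation.
  @event_emitter Application.get_env(:demo, :event_emitter, Demo.Events.EventEmitter)

  def validate(address) when is_binary(address) do
    # The injected module is called instead of a hard-coded one.
    @event_emitter.emit({:address_validated, address})
    String.length(address) < 32
  end
end
```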
Now we’re able to inject a mock module when running in the test environment (i.e. in `config/test.exs` we define `Demo.Events.EventEmitterMock` as the event emitter). Here’s the code for the mock module:
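The mock’s gist is not embedded here; a minimal version consistent with the description below could be (the event shape is an assumption carried over from the earlier sketches):

```elixir
defmodule Demo.Events.EventEmitterMock do
  # Pattern-matching on the argument ensures the outgoing command
  # was called with the expected event.
  def emit({:address_validated, address}) do
    # Runs in the caller's process, so self() is the test process.
    send(self(), {:event_emitted, {:address_validated, address}})
    :ok
  end
end
```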
We’re essentially using messages between processes to ensure that the side effect is being triggered. This process sends a message to itself (the first argument passed to `send`), which will allow us to check for it later in the test. Notice the pattern match on the function argument, which ensures that the outgoing command was called with the right arguments.
This has the advantage of creating mocks with explicit contracts that are easy to reason about and put in context. The disadvantage of this method is that the test logic is spread out between files, which makes it harder to comprehend.
With this in place, the code in our test file is really simple:
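A sketch of that test, assuming the module and mock shown earlier:

```elixir
defmodule Demo.AddressValidatorOutgoingTest do
  use ExUnit.Case, async: true

  test "validate/1 emits an :address_validated event" do
    Demo.AddressValidator.validate("221B Baker Street")

    # The mock ran in this same process, so the message
    # is already sitting in our mailbox.
    assert_received {:event_emitted, {:address_validated, "221B Baker Street"}}
  end
end
```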
As I’ve mentioned earlier, we’re using messages to test that the command gets called. As such, in the test we simply have to check the mailbox of the current process, and assert that we received the expected message.
However, this approach stops working when the code under test is spawning a new process, which is fairly common in Elixir. For instance, our `Demo.Events.EventEmitter` could be behind a worker pool (such as poolboy), in which case our approach of sending messages to `self()` would no longer work.
Turning the Mock Module into a GenServer
The cleanest solution I’ve found to this issue is to transform the mock module into a `GenServer`. Here’s the code for our new mock module:
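The original gist is missing; a reconstruction consistent with the description below could look like this (message shapes are assumptions):

```elixir
defmodule Demo.Events.EventEmitterMock do
  use GenServer

  def start_link(_opts \\ []) do
    # The state is the list of subscribed listener pids.
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  @impl true
  def init(listeners), do: {:ok, listeners}

  # Test processes call this to start receiving emitted events.
  @impl true
  def handle_call(:subscribe, {pid, _tag}, listeners) do
    {:reply, :ok, [pid | listeners]}
  end

  # Broadcasts the event to every subscribed listener.
  @impl true
  def handle_call({:emit_event, event}, _from, listeners) do
    Enum.each(listeners, &send(&1, {:event_emitted, event}))
    {:reply, :ok, listeners}
  end
end
```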
Note: this GenServer does not implement the client-facing functions (the ones that wrap the `handle_call` callbacks) because in this example we’re just calling `GenServer.call` directly.
Now the mock module keeps track of the processes that have subscribed to its events. Then, upon receiving the `:emit_event` call, it broadcasts the message to all of the subscribed listeners. This entails that the test process needs to subscribe to this `GenServer`. In fact, that’s the only change that needs to happen in the test code:
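A sketch of that change, assuming the GenServer-based mock above is started by the test:

```elixir
setup do
  # Start the mock under the test supervisor so it is
  # cleanly stopped when the test finishes.
  start_supervised!(Demo.Events.EventEmitterMock)

  # Subscribe the test process so emitted events land in our mailbox.
  GenServer.call(Demo.Events.EventEmitterMock, :subscribe)
  :ok
end
```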
We just need to define our `setup` before running the test. If we were indeed using the poolboy library, this is how we would subscribe to that mock module (assuming that `:demo_pool` was properly configured in the test environment):
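A sketch of such a `setup`, assuming a pool of mock workers (with a single-worker pool this is enough; with more workers, each one would need to be subscribed):

```elixir
setup do
  # Check out a worker from the pool and subscribe the test
  # process to the mock it wraps.
  :poolboy.transaction(:demo_pool, fn worker ->
    GenServer.call(worker, :subscribe)
  end)

  :ok
end
```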
In this post we’ve seen how Elixir has an interesting philosophy regarding testing, and also how we can properly test Elixir applications, particularly when there are mocks involved.
In the second part of this post we’ll be discussing some of the shortcomings of the solutions I’ve presented above, and how to overcome them. Namely:
- The testing logic is spread out across different files, which makes them hard to reason about;
- These solutions still allow us to create ad hoc mocks (i.e. mocks that aren’t based on behaviours).
Be on the lookout for the second part!
Note: This post is an adaptation of a talk I gave at Lisbon |> Elixir meetup.