How Do I Test?

Inspired by the blog posts from ikura and Finch, I thought I would write my own article about how I test my Erlang systems.

Honey, I Shrunk the Kids (1989)

In a Nutshell

The TL;DR version of this article is:

I practice TDD, I tend to use black-box tests written with Common Test, I always keep my test coverage at 100% and I add Meta Testing to all my projects.

Ok, now that I’ve lost most of my readers, let me go deeper into this subject which, I have to say, is one of my favorites…


Let’s start with the reasons I have for testing my systems the way I do.

I learned to work this way in my OOP course at UBA. The course was based on Smalltalk, and it was delivered by one of the best teachers I ever had: Hernán Wilkinson. In Smalltalk, the whole image (remember: the code, the VM, even your editor… they all live in the same image) is designed to be used in a TDD fashion, with lots of tools that simplify your life if you work that way. So… I learned to work that way. Later on, when I started working with Erlang, I found that even though it was not as easy as in Smalltalk, it was entirely possible to keep using TDD. And that’s what I did!

But that’s my story. Why would you use TDD when writing Erlang code? Well, let me tell you a couple of things I consider valid reasons for that:

If you’re testing it anyway, why don’t you write a test for it?

Most of the time, writing tests in Erlang is not difficult. It gets tricky when multiple processes are involved and you have to deal with concurrency and whatnot. But particularly in those situations, even if you don’t follow TDD, you usually test your code. You don’t just write the code and let it be: you open up a console, compile your module (not necessarily in that order) and write some commands to verify that your functions actually do what you want them to do.

If you don’t have erlang-history installed (and you should!), the second you close that console you lose all the expressions you evaluated. In order not to lose them, you copy them into a text file somewhere. In that case, why don’t you just put them inside a function (or multiple ones) and pattern match their results? If you do that, you suddenly have a test! That requires almost no additional effort, but now that test is easily repeatable. That’s a huge win.

Refactoring Code

Being able to confidently refactor your code is one of the main reasons to write tests in the first place, regardless of the language you choose. This subject has already been covered extensively on the internet, so I won’t go into detail here. If you want just one link to read, check StackOverflow.

Tests as Specs

Usually the main purpose of the tests I write is to specify the expected behavior of my system. I write the tests to say “This is what this function should do”, “This is how this module should work”, “This is what this API must return in this scenario”, etc. Later on, when I wonder “What did I write this function for?”, I want to be able to go to the test that exercises it and find the answer.


Most of the code I write these days belongs to open-source libraries. It’s crucial for those projects to be thoroughly tested, because people will use them and rely on them. As an open-source user, I always tend to trust the libraries I use. If there is an error, the libraries are the last place I consider debugging: my code goes first, of course. So, as an open-source author, I like to be able to confidently state that the libraries I provide do work as intended. Note that having tests doesn’t automatically ensure the absence of bugs, but at least it ensures that, if my users use the library the way the tests do, they’re on the safe side.

What do I do?

Enough with the intro! Let’s talk about how I actually test my Erlang code.

Simple Functions

As an Erlang trainer, I’m always faced with the task of assigning problems to my students so they can test their recently acquired knowledge. And then I have to check that their solutions actually meet the goals, right? Well, maybe not.

What I usually do is this: Besides the exercise statement, which is usually written in a PDF or other document, I build a simple function (called test/0) with them. They can use that function to test the code they write on their own before showing it to me.

For example, let’s say the exercise asks for a function temp:f2c/1 that converts Fahrenheit degrees into Celsius. In that case I would give my students something like this:
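The original snippet is not reproduced in this text, so what follows is only a minimal sketch of what such a test function could look like. The sample temperatures are my own choice for illustration, and the sketch assumes that f2c/1 returns a float.

```erlang
-module(temp).
-export([test/0]).

%% Sketch of a test function handed out with the exercise.
%% temp:f2c/1 is deliberately NOT defined here: writing and exporting it
%% is the exercise. Until then, test/0 fails with an undef error.
test() ->
    %% Inputs chosen (as an assumption) so that the expected results are
    %% exact floats and can be matched against literals.
    100.0 = temp:f2c(212),
    10.0  = temp:f2c(50),
    -40.0 = temp:f2c(-40),
    ok.
```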

That way they can try to compile the module and evaluate temp:test() in their Erlang consoles until it returns ok.

I’m taking advantage of 2 key parts of the Erlang language here:

  • Pattern Matching: I’m matching the results of calling the function I want to define against the proper patterns, knowing that any matching error there will break the test with a descriptive enough error message.
  • Dynamic Binding: The temp module compiles even when f2c/1 doesn’t exist. That’s because Erlang has dynamic binding for fully qualified functions: it only checks whether a function actually exists at runtime. If it doesn’t exist (and if you haven’t used Ghost Functions), it produces a runtime error, thus breaking the test, as expected.
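To make the dynamic binding point concrete, here is a small sketch of what happens when a fully qualified call targets a function that isn’t there. The module name no_such_module is hypothetical, picked precisely because it does not exist.

```erlang
-module(undef_demo).
-export([call_missing/0]).

%% This module compiles even though no_such_module (a made-up name)
%% does not exist: the remote call is only resolved at runtime.
call_missing() ->
    try no_such_module:f2c(32) of
        Result -> {ok, Result}
    catch
        %% A call to a missing module or function raises error:undef.
        error:undef -> undef
    end.
```

In a test function like test/0 nobody catches that error, so the failed call simply crashes the test.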

Simple Modules

Taking a step forward, consider an exercise that requires students to provide a full module, let’s say… a simple key value store. The exercise will list the expected module interface, with functions like new/0, find/2 or delete/2. In that scenario a simple test function may be too limited. This is what I do:
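The original snippet is missing from this text as well, so the following is only a sketch under assumptions: the argument order, the {ok, Value} and {error, not_found} return shapes, and the write/3 function are all hypothetical, chosen just to have something to match against.

```erlang
-module(db);
```

```erlang
-module(db).
-export([test/0]).

%% Sketch: one catch per test case, so one failure doesn't hide the others.
%% db:new/0, db:find/2 and db:delete/2 are the exercise; db:write/3 and
%% all return shapes below are assumptions made for this illustration.
test() ->
    [ catch new_is_empty()
    , catch write_then_find()
    , catch delete_removes_key()
    ].

new_is_empty() ->
    {error, not_found} = db:find(key, db:new()),
    ok.

write_then_find() ->
    Db = db:write(key, value, db:new()),
    {ok, value} = db:find(key, Db),
    ok.

delete_removes_key() ->
    Db = db:delete(key, db:write(key, value, db:new())),
    {error, not_found} = db:find(key, Db),
    ok.
```

Before any of the exercise functions exist, every element of the returned list is an {'EXIT', {undef, …}} tuple; each one turns into ok as the corresponding behavior is implemented.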

It’s almost the same as before but I’m using catch multiple times inside a list now. That way, students can detect multiple test errors (one per test case) when they evaluate db:test().

Whole Systems

Well, all of the above is very nice for basic functions and simple modules, but what happens when what you have to build is a big system? In those cases, I tend to apply a very similar approach. Most of the systems I build communicate with the outside world through a RESTful API, so the specifications of how the system should work are defined in terms of that API. Since I use my tests to specify how my system should work, I write tests that hit that RESTful API and verify its proper behavior. You can see an example below:
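The original example is not included in this text. As a rough stand-in, here is a sketch of a Common Test suite in that style. The local api_call/2 helper only imitates the idea of spts_test_utils:api_call(…), whose real signature is not shown here; the base URL, the /status endpoint and the shape of the returned map are all assumptions.

```erlang
-module(status_SUITE).
-export([all/0, init_per_suite/1, end_per_suite/1]).
-export([status_returns_ok/1]).

all() -> [status_returns_ok].

init_per_suite(Config) ->
    %% httpc is part of inets. A real suite would also start the
    %% system under test here.
    {ok, _} = application:ensure_all_started(inets),
    Config.

end_per_suite(_Config) ->
    ok.

%% Black-box test case: it only talks to the system through HTTP.
status_returns_ok(_Config) ->
    #{status_code := 200} = api_call(get, "/status"),
    ok.

%% Hypothetical helper for body-less requests (get, delete, ...).
%% The base URL is an assumption.
api_call(Method, Path) ->
    Url = "http://localhost:8080" ++ Path,
    {ok, {{_Vsn, Status, _Reason}, Headers, Body}} =
        httpc:request(Method, {Url, []}, [], []),
    #{status_code => Status, headers => Headers, body => Body}.
```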

As you can see above, what I’m doing is using spts_test_utils:api_call(…), which internally uses an http client, to hit the endpoints my server is supposed to provide. Then, I’m pattern-matching on the responses.

As you can see in the project where that code came from, I do that for RESTful APIs, binary APIs and even internal APIs.

After writing each test suite (sometimes just after writing each individual test case), I write the code needed for that test to pass, in the same TDD fashion I recommended before.

Once the whole system is built, I run all the test suites with a coverage report and if the coverage is below 100%, I write the missing tests to raise it up. Given the unavoidable boilerplate that OTP sometimes requires, I generally end up with stuff like the one below:
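The original snippet is not part of this text, so here is a sketch of the idea instead. To keep it in a single file, the gen_server callbacks and the test live in the same hypothetical module; in a real project the test would be a case in a Common Test suite calling the callbacks of the actual server module.

```erlang
-module(boilerplate_demo).
-behaviour(gen_server).

-export([init/1, handle_call/3, handle_cast/2,
         handle_info/2, terminate/2, code_change/3]).
-export([complete_coverage/0]).

%% Typical OTP boilerplate that black-box tests rarely reach.
init([]) -> {ok, #{}}.
handle_call(_Request, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.
handle_info(_Info, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

%% Calls the otherwise unreachable callbacks directly, with made-up
%% arguments, only so that their lines show up as covered.
complete_coverage() ->
    State = #{},
    {noreply, State} = ?MODULE:handle_cast(unexpected, State),
    {noreply, State} = ?MODULE:handle_info(unexpected, State),
    ok = ?MODULE:terminate(normal, State),
    {ok, State} = ?MODULE:code_change(old_vsn, State, extra),
    ok.
```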

Why do I do that? Well, Marcelo Gornstein once told me a great truth:

If you don’t have 100% coverage consistently, when you see a lower coverage number after implementing a new feature you’ll never be sure (just by looking at the number) if the lines that are not covered were introduced by you or were already uncovered before.

Always maintaining 100% coverage gives me peace of mind.

Meta Testing

Finally, following the same logic as Zach in his article, I also like to use all the tools that Erlang, OTP and the community provide to ensure code quality. In particular, I have all my projects in GitHub reviewed by Elvis and/or Gadget. That ensures that every single PR is checked with the Erlang compiler, dialyzer, elvis and xref.

But another thing I learned from Hernán is that you can include all those things directly in your tests. That’s what he calls Meta Testing. InakaESI created a super-simple tool (katana_test) that allows you to add Meta Testing to your Erlang projects in a few lines of code. With katana_test I get dialyzer, xref and elvis directly in my test suites, and they check every single change that I make to my code. Isn’t that cool?
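To give a feel for the idea without relying on katana_test’s API (which is not shown here), the following sketch runs one such check by hand, using the xref module that ships with OTP. The module and function names are hypothetical, and a real meta suite would do considerably more than this.

```erlang
-module(meta_sketch).
-export([undefined_calls/1]).

%% Runs xref over a directory of .beam files and returns the calls to
%% undefined functions it finds. This is NOT how katana_test is used;
%% it only illustrates running a static check from test code.
undefined_calls(EbinDir) ->
    Results = xref:d(EbinDir),
    {undefined, Calls} = lists:keyfind(undefined, 1, Results),
    Calls.
```

A test case could then match the result against the empty list, pointing the function at the directory where the project’s compiled modules live, so that any call to an undefined function breaks the suite.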

But it is too slow!

Well… maybe not that cool sometimes. This way of working comes with a dark side, too. TDD forces me to run the tests a lot. When I’m developing software, I actually spend most of my time either writing or running tests.

Testing whole systems black-box style as I recommended above (i.e. hitting APIs as if the test was the client instead of testing internal functions in unit-test style) makes my tests considerably slower. Meta Testing is not very fast either (especially the first time, when dialyzer creates the plt).

You can see how those two things are conflicting, right?

My usual solution to this problem is to choose my suites wisely:

  • I run all the suites and check the coverage report before I start development, to make sure that everything is working and everything is tested.
  • From then on, I only run the test suites I edit or create, and sometimes the meta suite as well. rebar3, for one, has a very simple way to skip the suites you don’t need to run.
  • Once I’m ready to push my code, I run the whole thing again with coverage and all to make sure I didn’t accidentally break something and I have covered all the lines that I inserted with tests.

This is certainly not faster than not running tests at all, of course. It’s also slower than just writing and running unit tests. But on the other hand, this way of working gives me lots of peace of mind, and that is priceless.