Maps vs Pattern Matching in Elixir

Some time ago, while introducing Elixir at Cabify, we did a workshop where we picked some simple problems and implemented them to get a better grasp on Elixir.

Today, I came back to one of these mini problems. The goal was calculating the “age” of a person in different planets using a conversion factor… a pretty simple problem.

But, at the time, we built two solutions… one based on maps and other using pattern matching. I also added some benchmarking code.

Using a map as a lookup table

Coming from ruby, our default was to build a lookup table (planetfactor).

Looking up entries in a hash table is a fast operation. But one very powerful tool of Elixir is pattern matching… how would it compare against it?

Using pattern matching

So I built this other version using multiple function bodies instead of looking up the factors:

So we fix the factors on the code at compile time instead of looking them up at runtime, and we let Elixir and the BEAM optimize the matching for us.

How do they compare?

Apart from style, which would be quite subjective (in our case, people less used to Elixir preferred the map version, while more experienced ones liked pattern matching) we can compare both versions by their performance.

I used Benchee for that. After updating elixir, erlang and benchee, I run my old benchmak.

The result is the same as last year: the pattern matching implementation is faster.

Note that Benchee issues some warnings because we are testing very fast functions, so there will be some inaccuracies (I wouldn’t swear by the 1.19x slower result) but after many runs using a Map is consistently slower

But why?

The Erlang Efficiency Guide provides some light on how pattern matching works.

In an ideal situation, pattern matching on function heads uses a binary tree whose nodes have been re-arranged by the compiler (so performance is much better than a linear scan that I initially assumed).

Still, shouldn’t a lookup table be faster than walking a tree?

The elephant in the room is the square bracket syntax used to access map elements: @planet_factors[planet]

Square brackets use the Access protocol, so involves extra costs. Performance analysis of different access methods to maps and keyword lists is in this great Medium post.

After replacing it with the fastest option (pattern matching on the map to extract the conversion factor: %{^planet => factor} = @planet_factors), the results are indistinguishable performance-wise and we are back to the preferences in code style.

Bonus track, generating patterns from data files / configuration

Those with a trained eye may have noticed a with patterns and macros entry on the Benchee results.

This is an experiment for squeezing more performance on some situations where we have the lookup table defined outside of code (like yaml or csv files).

It may seem that in this case we are stuck with using a map, but we can also use macros to parse the input and generate the functions at compile time (or to convert a map into functions)

The code is not beautiful and the use case contrived, but serves as a proof of concept (in fact, this technique is used to implement part of Elixir UTF support).

For some reason I have not yet figured out

Conclusion

Maps and pattern matching can achieve similar speed if we pay attention to how we use our maps. We can also compile maps into functions, even loading the lookup table from an external file during compilation phase.

This leaves the decision of what tool to use to you. Isn’t that great?

Personally, I find pattern matching suits me better.

Resources

The source code for this benchmark is available on GitHub.