Benchmarking Elixir Benchmarking Tools

When it comes to programming, benchmarking is not always necessary. It’s not crucial to always write the most efficient algorithm, but that doesn’t stop programmers from pondering which approach is the most efficient, and whether efficiency by itself is even enough.

After all, efficiency is really a deciding factor only when you’re choosing between two comparable results.

That said, it’s fair to say that the main purpose of a benchmarking tool or library isn’t simply to provide you with numbers: it should give you useful numbers that make sense and lend themselves to further evaluation.

At its most basic level of functionality, a benchmarking library should independently collect data about your different solutions and deliver a comparative overview of these results. The report should be an accurate, user-friendly and well-documented interpretation of the data it has collected.

These are both pretty basic requirements, and it’s the features above and beyond these two foundation blocks that make a good library great.

Being easy to learn, set up and use really matters when you’re first getting to grips with any new tool; then, as time passes, formattable output and configurability become important too.

Let’s look at some of the well-known benchmarking libraries available for Elixir, focusing on their pros and cons rather than the algorithms being tested.

Input

For the purpose of this post, I benchmarked three different implementations of the map function against each other: a tail-recursive one, a body-recursive one and, finally, the map function from the Elixir standard library.

The suggested input was taken from this presentation (pp. 75–76), given during Elixirlive in Warsaw (there’s a video too!).

You can find the code needed for the tail- and body-recursive map functions in this gist, and Enum.map is what we’ll use as the standard library reference for benchmarking.
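
To give a feel for what’s being benchmarked, here’s a minimal sketch of the two hand-rolled implementations, along the lines of the gist (the module name MyMap is just for illustration):

    defmodule MyMap do
      # Body-recursive map: builds the result list as the call stack unwinds.
      def map_body([], _func), do: []
      def map_body([head | tail], func), do: [func.(head) | map_body(tail, func)]

      # Tail-recursive map: accumulates in reverse, then reverses once at the end.
      def map_tail(list, func), do: do_map_tail(list, func, [])

      defp do_map_tail([], _func, acc), do: :lists.reverse(acc)

      defp do_map_tail([head | tail], func, acc),
        do: do_map_tail(tail, func, [func.(head) | acc])
    end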

You can find the full project in the following GitHub repository.

Benchee

Link to the repository: https://github.com/PragTob/benchee

Benchee is a library for micro benchmarking in Elixir. It’s capable of benchmarking the performance of your specific functions and it provides the output both in the terminal and in a file. Generally speaking, the setup is simple and the output is substantial, measuring performance in Iterations Per Second (IPS) and offering a decent range of configuration options.

Measurement is time-based: each provided function is run repeatedly for a configured duration, five seconds by default, and this can be changed in the configuration settings.
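
As a sketch of what a run can look like with a recent Benchee version (option names may differ between versions; MyMap is the sketch from earlier):

    list = Enum.to_list(1..10_000)

    Benchee.run(
      %{
        "tail-recursive" => fn -> MyMap.map_tail(list, &(&1 + 1)) end,
        "body-recursive" => fn -> MyMap.map_body(list, &(&1 + 1)) end,
        "Enum.map" => fn -> Enum.map(list, &(&1 + 1)) end
      },
      time: 5
    )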

Pros

  • It can be extended with plugins (some existing plugins can be found here and here)
  • Extensive reports on execution, including:
      • IPS
      • Average
      • Deviation
      • Median
      • Comparisons between the provided functions
  • Minimal setup
  • Many configuration parameters to customise the execution
  • Simple handling of input parameters for benchmarking functions
  • Well documented
  • Two modes: parallel and sequential
  • Open source

Cons

  • Lack of setup and teardown steps
  • Lack of support for measuring memory usage
  • Lack of info on the parallel execution option
  • Lack of an internal tool for comparing saved runs

Sample code

Here’s a link to the benchmarking implementation for Benchee: https://gist.github.com/ghatighorias/2332ddbb0d40cfdda95c80f780341858

Sample Output

Here’s a link to the output of the benchmarking: https://gist.github.com/ghatighorias/625dc64b696f0666d5cec6112d4f80fe

Benchfella

Link to the repository: https://github.com/alco/benchfella

Benchfella is a library for micro benchmarking in Elixir and it’s written in a similar way to ExUnit. It’s capable of benchmarking the performance of your specific functions and provides the output both in the terminal and in a file.

Like Benchee, Benchfella also measures performance in IPS, but it’s fair to say that the method of execution differs. While Benchee runs and measures performance based on the elapsed time of the requested executions, Benchfella measures the number of iterations that can be performed in a given time. Because Benchfella uses a different method of benchmarking, the results might seem unusual at first glance.

Unlike Benchee, Benchfella is a mix task. Benchee is a plain library and it’s up to you to decide how to run it, but for Benchfella you’ll need to run mix bench, and it requires a specific folder structure too, so remember to create a folder named bench under the root of your mix application and put the benchmark files in there.

Make a mistake in the folder structure and the benchmarks will not run. Remember also that Benchfella saves each execution result in a snapshots folder, which is created automatically.
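
For illustration, a minimal benchmark module along these lines could live in bench/map_bench.exs and be run with mix bench (the file and module names are just examples; MyMap is the sketch from earlier):

    defmodule MapBench do
      use Benchfella

      # Shared input for all benchmarks in this module.
      @list Enum.to_list(1..10_000)

      bench "Enum.map" do
        Enum.map(@list, &(&1 + 1))
      end

      bench "tail-recursive map" do
        MyMap.map_tail(@list, &(&1 + 1))
      end
    end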

Benchfella’s output, unlike Benchee’s, does not provide the median or deviation, which makes it hard to estimate the range of results, so be prepared to observe notable changes between execution results, particularly with functions that are heavily system-dependent, like reading from or writing to files. In such situations, your function additionally depends on how busy your OS is when the read/write is triggered.

This is where calculating the median and deviation really comes into its own. It’s worth mentioning that the median and deviation are easy enough to calculate, but designing a specific test and the functionality to calculate, format and print them is another matter altogether.
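
To illustrate that the statistics themselves are the easy part, here’s a sketch of computing the median and standard deviation from a list of raw timings; the Stats module is hypothetical and not part of any of these libraries:

    defmodule Stats do
      # Median: middle value of the sorted samples, or the mean of
      # the two middle values for even-length lists.
      def median(samples) do
        sorted = Enum.sort(samples)
        len = length(sorted)
        mid = div(len, 2)

        if rem(len, 2) == 1 do
          Enum.at(sorted, mid)
        else
          (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
        end
      end

      # Standard deviation: square root of the average squared
      # distance from the mean.
      def std_dev(samples) do
        mean = Enum.sum(samples) / length(samples)

        variance =
          samples
          |> Enum.map(fn s -> (s - mean) * (s - mean) end)
          |> Enum.sum()
          |> Kernel./(length(samples))

        :math.sqrt(variance)
      end
    end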

Pros

  • Saves the benchmark snapshot
  • Has an internal comparison tool for different executions
  • Implementation format similar to unit tests in Elixir
  • Encourages you to keep benchmarks out of the actual project code
  • Internal graph generator for benchmarks
  • Open source

Cons

  • Not well documented
  • Output is not well described
  • Lack of extensive report on output
  • Rigid folder structure and naming
  • Not so simple to extend

Sample code

Here’s a link to the benchmarking implementation for Benchfella: https://gist.github.com/ghatighorias/c16e965eb3e384630e7284478a84ab60

Sample Output

Here’s a link to the output of the benchmarking:

Bmark

Link to the repository: https://github.com/joekain/bmark

Bmark is a library for micro benchmarking in Elixir that also resembles ExUnit. It’s capable of benchmarking the performance of your specific functions and provides the output in files.

Compared to Benchee and Benchfella, Bmark provides much less advanced functionality. The general structure of writing benchmarks with Bmark is close to Benchfella’s, with minor differences in naming conventions. Unlike Benchfella, though, some other utility functionality, like the setup_all macro, is not available.
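
A Bmark module looks something like this sketch (typically placed in a bmark folder and run with mix bmark; the names here are just examples, and MyMap is the sketch from earlier):

    defmodule MapBmark do
      use Bmark

      # Each bmark block is measured separately and its results
      # are written to a file of its own.
      bmark :enum_map do
        Enum.map(1..10_000, &(&1 + 1))
      end

      bmark :tail_recursive_map do
        MyMap.map_tail(Enum.to_list(1..10_000), &(&1 + 1))
      end
    end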

Results from a Bmark execution won’t be printed out. They are saved under a results folder in the root of the mix project. The result of each benchmarked function in a module is captured and saved in a separate file, and unfortunately the snapshot file is not very descriptive unless you use the built-in comparison tool.

The integrated comparison functionality has its limitations too. It’s capable of comparing only two results at a time, which makes it really difficult when benchmarking several functions with different input sizes. Moreover, the Bmark comparison tool still doesn’t tell us a lot about the execution results, and the output is rather cryptic.

Pros

  • Saves benchmark snapshots
  • Has an internal comparison tool for different executions
  • Implementation format similar to unit tests in Elixir
  • Encourages you to keep benchmarks out of the actual project code
  • Internal graph generator for benchmarks
  • Open source

Cons

  • Not well documented
  • Output is very poor — basically a list of numbers
  • Lack of extensive report on output
  • Rigid folder structure and naming
  • Saves files under the main folder branch
  • Not so simple to extend
  • Does not follow the best parts of ExUnit

Sample code

Here’s a link to the benchmarking implementation for Bmark: https://gist.github.com/ghatighorias/5470bb8fb33552c3f598462521e79c37

Sample output

Here’s a link to the saved snapshot of one run:

Here’s a link to the output of the integrated comparison tool, run on two different snapshots:

Please note that Bmark doesn’t print any output to the console; it saves the results directly as execution snapshots.

Conclusion

So, which one of these libraries is my favourite? I choose Benchee. It has all the pros that make a good benchmarking tool great (extensibility and extensive output) and a whole lot more. Some of the cons have already been raised as issues in the GitHub repository, so there’s a good chance of improvement thanks to an actively engaged community.
