JavaScript Test-Runners Benchmark

(Part 1, The Unit Testing)

Vitaliy Potapov
Jun 27, 2017
Photo by Braden Collum

Performance is an important criterion when choosing a test-runner. Tests should pass as fast as possible to detect errors earlier, improve developer experience and reduce CI server running time. In this story I will compare the most popular JavaScript test-runners on the same set of unit tests and find the winners.

Candidates

I will not go deep into the specific features of test-runners. Each one has many pros and cons. For a detailed comparison of features I can suggest this awesome Overview of JavaScript Testing in 2017 by Vitalik Zaidman. Here I will talk only about cold-run execution time. I will use all test-runners in their out-of-the-box setup and toggle only a few common options that I will describe later.

So, please welcome the candidates for today’s performance competition:

  • mocha (plus two parallelizing wrappers: mocha.parallel and mocha-parallel-tests)
  • jasmine
  • tape
  • qunit
  • lab
  • tap
  • jest
  • ava

Preparation

Before the starter pistol fires, let’s discuss the rules. To make the comparison fair I should apply each runner to the same set of tests with the same running options. But that is not possible as-is, because each runner has its own test format and its own running options. Therefore I will perform some unification.

Generating test-files

For the tests themselves I’ve defined 3 common logical parameters:

  • synchronous / asynchronous (constant delay, random delay)
  • with nested suites / without nested suites
  • with before|after|beforeEach|afterEach hooks / without hooks

Every combination of these parameters is used to generate test-files. For example, synchronous tests (without delay) look like this:

describe('suite 0', function () {
  it('test 0 0', function () {});
  it('test 0 1', function () {});
  it('test 0 2', function () {});
  it('test 0 3', function () {});
  it('test 0 4', function () {});
});

and asynchronous tests (with delay) look like this:

describe('suite 0', function () {
  it('test 0 0', function (done) { setTimeout(done, 3); });
  it('test 0 1', function (done) { setTimeout(done, 0); });
  it('test 0 2', function (done) { setTimeout(done, 8); });
  it('test 0 3', function (done) { setTimeout(done, 2); });
  it('test 0 4', function (done) { setTimeout(done, 5); });
});

None of the tests perform any assertions. This allows measuring the test-runner’s own performance cost.
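
To give an idea of how such files are produced, here is a minimal sketch of a generator in the spirit described above; the function and file names are hypothetical, not the actual benchmark code:

const fs = require('fs');

// Hypothetical sketch: build one suite of 5 tests, either synchronous
// (empty bodies) or asynchronous (setTimeout with pre-defined delays).
function generateSuite(suiteIndex, {isAsync = false, delays = []} = {}) {
  const tests = Array.from({length: 5}, (_, i) =>
    isAsync
      ? `  it('test ${suiteIndex} ${i}', function (done) { setTimeout(done, ${delays[i] || 0}); });`
      : `  it('test ${suiteIndex} ${i}', function () {});`
  ).join('\n');
  return `describe('suite ${suiteIndex}', function () {\n${tests}\n});\n`;
}

// Write one synchronous and one asynchronous test-file.
fs.writeFileSync('sync-0.js', generateSuite(0));
fs.writeFileSync('async-0.js', generateSuite(0, {isAsync: true, delays: [3, 0, 8, 2, 5]}));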

Split per runner

Moreover, all test-files are generated individually for each test-runner in its own format. For example, a test-file for mocha looks like:

describe('suite 0', function () {
  it('test 0 0', function () {});
  it('test 0 1', function () {});
  it('test 0 2', function () {});
  it('test 0 3', function () {});
  it('test 0 4', function () {});
});

But the same file for ava is:

import test from 'ava'; 

test('test 0 0', () => {});
test('test 0 1', () => {});
test('test 0 2', () => {});
test('test 0 3', () => {});
test('test 0 4', () => {});

Runner options

Each runner has different CLI options. I’ve identified 3 common options that exist in most runners and affect performance:

  • serial execution / parallel execution
  • for parallel execution — number of concurrent workers
  • with Babel transpiling / without Babel

For example, to start tap with 4 parallel workers:

tap test.js --jobs=4

although for jest the same command is:

jest test.js --maxWorkers=4
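
For the Babel option, runners that do not transpile by default can load Babel through a require hook. A hedged example for mocha, assuming babel-register is installed and a .babelrc is present (an illustration of mine, not a command from the original benchmark):

mocha test.js --require babel-register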

Referee’s stopwatch

For measuring execution time I will use the bash time command. I will apply it to each runner’s CLI call and take the real value. For example, to measure mocha:

> time mocha %path_to_tests%

real    0m0.282s
user    0m0.265s
sys     0m0.043s

To call different runners with different options I’ve automated the benchmarking process. Using the shelljs library I’ve arranged a racing loop: it iterates over all runners, calls the corresponding CLI command and saves the times from the output:

const shelljs = require('shelljs');

// `runners` and `testsPath` are defined elsewhere in the benchmark script.
const results = [];
runners.forEach(runner => {
  // run the runner's CLI under `time` and parse the "real" value from stderr
  const cmd = `time ${runner} ${testsPath}`;
  const proc = shelljs.exec(cmd, {silent: true});
  const matches = proc.stderr.match(/real\s+([0-9]+)m([0-9\.]+)s/);
  const minutes = parseInt(matches[1], 10);
  const seconds = parseFloat(matches[2]);
  const totalTime = minutes * 60 + seconds;
  results.push({runner, totalTime});
});

At the end I get an array of execution times, one per runner. This data allows building a chart.

Charts

To make a visual representation of the results I will use bar charts. The Chart.js library is a great tool for that.
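
As an illustration, a bar chart for the collected results could be built roughly like this (a sketch against the Chart.js 2.x API; the canvas id and labels are my own, not from the benchmark code):

// Assumes Chart.js is loaded on the page, a <canvas id="benchmark"> element exists,
// and the results array of {runner, totalTime} from the racing loop is available.
const ctx = document.getElementById('benchmark').getContext('2d');
new Chart(ctx, {
  type: 'bar',
  data: {
    labels: results.map(r => r.runner),
    datasets: [{
      label: 'Execution time, s',
      data: results.map(r => r.totalTime)
    }]
  },
  options: {
    scales: {yAxes: [{ticks: {beginAtZero: true}}]} // Chart.js 2.x scale options
  }
});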

Environment and equipment

I will run all benchmarks on Node.js 7.2 on my MacBook Pro, 2.6 GHz Intel Core i5 (4 CPUs, OS X El Capitan). All runners are the latest versions installed from npm at the time of writing.

RUNNER                 VERSION
mocha                  3.4.2
mocha.parallel         0.15.2
mocha-parallel-tests   1.2.9
jasmine                2.6.0
tape                   4.6.3
qunit                  2.3.3
lab                    13.1.0
tap                    10.3.3
jest                   20.0.4
ava                    0.19.1

Of course, if you run the benchmark on your own machine you will get different absolute values, but the relative results should be similar. If you get a quite different picture, feel free to comment.

Ready, Steady, Go!

The competition consists of two major groups:

  • synchronous tests
  • asynchronous tests

Synchronous tests

Actually such tests are empty functions:

it('test', function() {});

Let’s start with the simplest case — synchronous tests with no nested suites and no hooks. All runners can run such tests. The final chart is:

The top 7 runners show very close times, all within 0.5 second, with the leader finishing in 215 ms. The other 3 runners are several times slower. I guess the reason for the slowness is that jest and ava perform Babel transpiling by default, and it takes considerable time.

Let’s do the next run where all participants will perform Babel transpiling. Some runners are excluded as they do not support it out-of-the-box:

Execution time expectedly increased for all runners, and the leader changed. But ava is still 3x–5x slower than the others (even when I set concurrency=4, which matches the 4 cores of my machine, as suggested here).

I decided to arrange an individual run for ava with different concurrency values on the same 50 test-files:

The fastest result is again about ~9 seconds. I’ve looked at several performance-related issues in the ava repository. It seems the main “time-eater” is the forking of Node.js processes.
Also, the chart shows that the default run of ava is not optimal: it does not set a default concurrency. For the best performance you should set it manually depending on your CPU count.
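
For reference, the concurrency can be set via the CLI; as far as I know the flag looks like this (check the docs of the ava version you use):

ava --concurrency=4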

The last run for synchronous tests is with nested suites and hooks. Some people love nested suites and some consider them redundant for unit testing; some runners support them and some do not. Such a benchmark should help you choose a runner if you are in the “nested-suites camp”. Each test-file contains 2 nested suites with 5 tests per suite — 50 files and 500 tests in total:

describe('suite 0', function () {
  describe('nested suite 0', function () {
    // 5 tests
  });
  describe('nested suite 1', function () {
    // 5 tests
  });
});

The result:

Basically, nesting of suites does not have a big performance impact. The result is very similar to the first chart, with the same runners at the top.

Asynchronous tests

Each asynchronous test is an empty function wrapped in a setTimeout:

it('test', function(done) { setTimeout(done, 0); });

Asynchronous tests can be executed by a runner in parallel thanks to the async nature of Node.js. But not all runners support this. For example, mocha does not support parallel test execution. That’s why I’ve included mocha.parallel and mocha-parallel-tests in the benchmark: they wrap the original mocha and allow running tests in parallel.
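
For the curious, a test-file for mocha.parallel looks roughly like this: it exposes a drop-in replacement for describe, and the tests inside a suite run concurrently (a sketch based on its README, not the generated benchmark file):

// mocha.parallel's export is used in place of describe();
// asynchronous tests inside the suite are started in parallel.
const parallel = require('mocha.parallel');

parallel('suite 0', function () {
  it('test 0 0', function (done) { setTimeout(done, 3); });
  it('test 0 1', function (done) { setTimeout(done, 0); });
});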

The first run is all tests with zero delay and no suites / no hooks:

The fastest runner from the synchronous runs keeps the lead. mocha.parallel and mocha-parallel-tests look very good, though not far ahead of the original mocha. Otherwise the picture is similar to the synchronous case, except that one runner became significantly slower.

But real-life tests do not have constant zero delay — they actually take some time. To emulate that I could simply insert a Math.random() delay into each test, but that would be an incorrect approach: random values would differ from runner to runner and the benchmark would not be fair. For a true emulation I’ve pre-generated a sequence of random numbers in the range of 0–10 and then used these numbers as delays in the test-files for each runner. This ensures the benchmarks are identical across runners.
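
In code, the idea is simply to generate the delays once and reuse them for every runner’s files; a minimal sketch (names are mine, not from the benchmark):

// One shared sequence of pseudo-random delays in the 0-10 range,
// generated once and reused when producing test-files for every runner.
const delays = Array.from({length: 500}, () => Math.round(Math.random() * 10));

// Later, while generating test-files, delays[testIndex] is inserted into each
// setTimeout call instead of calling Math.random() separately per runner.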

Let’s look at the results:

Here the winner is clear: it is 2x faster than the nearest competitor and up to 4x faster than some of the others. The other runners lined up in a similar order as in the previous zero-delay run.

And the final lap: keep the random delay, but add nested suites and hooks:

The result is very interesting! The winner executed the job within a second, while other runners took 10 seconds, 18 seconds and even 61 seconds. To be sure, I re-checked several times and the result is persistent. This runner definitely deserves attention.
Also, this is the first case where ava is faster than some of the non-parallel runners, despite the fact that it performs Babel transpiling. It proves that parallelization and utilizing all CPUs are very important for testing real asynchronous code.

The finish

While the runners are catching their breath, I’ll draw some short conclusions:

  1. If you need just a fast and simple runner for synchronous unit-testing, one of the time-proven runners (or one of the slightly younger ones near the top of the charts) is a good choice.
  2. For asynchronous tests you may consider wrappers like mocha.parallel and mocha-parallel-tests. ava also shows pretty good results.
  3. The trendy jest and ava are generally slower. But in contrast they offer many additional features and can significantly improve your testing experience.
  4. The benchmark itself is also a tool that can be improved. Each runner offers additional options for performance tuning; if you know how to boost it, feel free to share. I’ve published all the benchmark code on GitHub. You can play with it locally and build your own charts. I believe it will help other developers to make the right choice and save more testing time. After all, time is one of the main treasures in our life!

Thanks for reading and happy testing!

