Jest: Our journey into performant unit tests

Introduction

We recently migrated all of our unit tests from AVA to Jest. Now that we are done, our tests run more than 11 times faster. We learned a lot during the process, and there are some things we would have done differently.

Do you feel your unit tests could be more performant? Do you feel your unit tests are not really testing units? Are you just curious about how we did it? Then you are in the right place.

Start with why

First of all, we had accumulated quite a lot of technical debt in our tests. This is 100% on us, and we are not going to blame AVA or praise Jest for solving it. We were not using several AVA features correctly, several tests were bloated, the same testing helpers had inconsistent implementations, and so on. This had a huge impact on the performance of our tests. A fresh look was something we were longing for.

We wanted to start doing mutation testing. Out of the box, Stryker has better support for Jest (through jest-runner) than for AVA, although there are still ways to use it with AVA. Mutation testing in a nutshell:

Bugs, or mutants, are automatically inserted into your production code. Your tests are run for each mutant. If your tests fail then the mutant is killed. If your tests passed, the mutant survived. The higher the percentage of mutants killed, the more effective your tests are.
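To make the idea concrete, here is a minimal sketch in plain JavaScript. The function, the mutant and both "suites" are hypothetical and written by hand; a tool such as Stryker generates the mutants and runs your real test suite for you.

```javascript
// Hypothetical production code: free shipping for orders of 50 or more.
const isFreeShipping = (total) => total >= 50;

// A mutant a mutation testing tool might generate: `>=` becomes `>`.
const isFreeShippingMutant = (total) => total > 50;

// A weak suite checks only values far away from the boundary.
const weakSuite = (fn) => fn(100) === true && fn(10) === false;

// A stronger suite also checks the boundary value itself.
const strongSuite = (fn) => weakSuite(fn) && fn(50) === true;

// Both suites pass against the original code.
console.log(weakSuite(isFreeShipping), strongSuite(isFreeShipping));

// The mutant survives the weak suite (it still passes)...
const survivedWeak = weakSuite(isFreeShippingMutant);
// ...but is killed by the strong suite (it now fails).
const killedByStrong = !strongSuite(isFreeShippingMutant);
console.log({ survivedWeak, killedByStrong });
```

A surviving mutant points at behavior that no test pins down, which is why the percentage of killed mutants says more about a suite than line coverage alone.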

Last but not least, we felt there were better tools than the ones we were using, and the Jest ecosystem is growing a lot.


(Unit testing)

I want to pause briefly here to introduce a few concepts and to clarify what we mean by unit testing. If we look it up on Wikipedia, we find this particular bit:

Ideally, each test case is independent from the others. Substitutes such as method stubs, mock objects, fakes, and test harnesses can be used to assist testing a module in isolation. Unit tests are typically written and run by software developers to ensure that code meets its design and behaves as intended.

We want our tests to be independent. In order to do that we are going to follow 3 basic principles:

  1. Mock all the things: if there is a dependency, mock it. Always favor isolation.
  2. Write once, configure many: if you create a mock of a module inside a test, make it configurable and move it to a shared place (more on this later).
  3. Understand your mock API: there are different actions that can be performed on mocks (see how Jest defines mockClear, mockReset and mockRestore). There are also different ways of writing and generating mocks (see manual mocks).

Understand how the library you are using works. What is happening when you mock? This might seem trivial, but it is not.


The how — process and tools 🔧

Big bang ☄️

We took a controversial, yet key, decision at the beginning of the migration: “we are going to merge this in one go”. Full big bang. No pull requests getting merged gradually into master.

You might think we are crazy. All that risk in one massive PR? What if it fails? Who will code review it? We were confident that we would be in a better place after the migration, and we wanted to finish as fast as possible.

There is one clear benefit after making this decision: you don’t have 2 systems doing the same thing, at the same time. How do you calculate coverage with 2 test runs? We did not want to find out.

The main issue we found with this approach was conflict resolution, as we were modifying EVERY test file. Git conflicts were guaranteed, but we trusted our git skills.

Conventions 🛑

  • No mocks are automatically reset/restored: if you are faking behavior, be explicit about it when you set it up and tear it down. We only clear our mocks after every test.
    This reduced the “magic” factor when running tests for us.
  • No external test helpers: avoid creating abstractions that are needed just for tests. If we are repeating something a lot, let’s take a closer look, but don’t rush to create test/helpers/**/*.js.
    This made our tests very declarative: we don’t need to check another file to see what a test does.
  • No console output: you should not output to the console during tests. Make sure you are not outputting prop-type warnings or something similar. These are hard to remove when you have more than one test running in parallel.
    This allowed us to spot such warnings and fix them. Also, less noise.
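The first convention maps onto three Jest configuration options. The snippet below is a sketch of such a setup, not our exact configuration file: clearMocks wipes the recorded calls and results between tests, while resetMocks and restoreMocks (both false by default) are spelled out to make the "nothing is reset or restored for you" rule visible.

```javascript
// jest.config.js (sketch; an assumption, not the actual file from this migration)
const config = {
  // Jest clears recorded calls and results between tests...
  clearMocks: true,
  // ...but stubbed implementations are never reset or restored automatically.
  resetMocks: false,
  restoreMocks: false,
};

module.exports = config;
```

With this in place, any stubbed return value set in one test has to be set up and torn down explicitly by that test.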

The what — show me some code plz ⌨️

You’ve made it this far without a single line of code. Let’s change that.

There are three major types of files throughout our codebase.

  1. JavaScript functions or modules: normal JavaScript functions/classes/services, whatever you write in plain JS.
  2. React components: the stuff you use to implement the MDN header 😬.
  3. MST stores: for state management, flow control, etc.
Tip: Are you using Redux? Interested in what MST is? Check out The Curious Case of Mobx State Tree

Take a look at how we ended up testing each of these types by following what we explained before.

JavaScript functions or modules λ

Not much to be added here. As long as you correctly mock your dependencies and follow our/your conventions, you should be good to go. Never used test.each before? Take a look

In this example, we are not stubbing the return value of getString. That is because we don’t really care about the actual value of a string. We just need them to be the same (by spec). getString works and its behavior is asserted in another unit test.
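As a hypothetical illustration of that idea (getButtonLabel, the string keys and the table of cases are all made up for this sketch), the snippet below compares the unit's output with whatever getString returns instead of hard-coding or stubbing the text. It uses a plain loop so it runs without a test runner; with Jest, the same table of cases would be handed to test.each.

```javascript
// Hypothetical string lookup. Its own behavior is covered by a separate unit test.
const strings = { 'button.save': 'Save', 'button.cancel': 'Cancel' };
const getString = (key) => strings[key];

// Hypothetical unit under test: by spec, a button label is the string for its action.
const getButtonLabel = (action) => getString(`button.${action}`);

// One row per case, the same kind of table you would pass to test.each.
const cases = [['save'], ['cancel']];

// getString is neither stubbed nor hard-coded: the label only has to match it.
const results = cases.map(
  ([action]) => getButtonLabel(action) === getString(`button.${action}`)
);
console.log(results);
```

If the wording of a string changes tomorrow, this kind of test keeps passing, because it asserts the relationship and not the text.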

React components ⚛️

Our tests for React components are pretty similar across the board. We have very little business or other logic in the components themselves, so they are easy to test and are only concerned with rendering and calling back. We use Enzyme to test their output. We don’t mount our components because, as we are unit testing, we don’t want to render the full tree (as of Enzyme v3, shallow rendering calls the componentDidMount lifecycle hook, so we can still cover that).

Let’s take the following example of a Page component. It wraps the given body of the page (in the form of children) between a header and some footer links. We also expose TEST_IDS constants to be used in unit and other types of tests.

In many of our component tests you will find a getWrapper function, which provides a way to get the component that is currently being tested with some default props and the ability to override them. When you need a certain prop value for your test, you pass it along. When you don’t need anything in particular, just get something to work with (write once, configure many).
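Stripped of React and Enzyme, the "defaults plus overrides" idea behind getWrapper can be sketched in plain JavaScript. The prop names below are hypothetical; in a real component test the resulting props would be passed to Enzyme's shallow, as the comment suggests.

```javascript
// Hypothetical default props for a Page component; names are illustrative only.
const defaultProps = {
  title: 'Dashboard',
  isLoggedIn: false,
  onLogout: () => {},
};

// Write once: one factory owns the defaults.
// Configure many: each test overrides only what it cares about.
const getProps = (overrides = {}) => ({ ...defaultProps, ...overrides });

// In a component test, getWrapper would hand these props to Enzyme:
// const getWrapper = (overrides) => shallow(<Page {...getProps(overrides)} />);

const anything = getProps(); // "just give me something to work with"
const loggedIn = getProps({ isLoggedIn: true }); // one prop matters for this test
console.log(anything.title, loggedIn.isLoggedIn);
```

Because the defaults are copied on every call, one test overriding a prop cannot leak into the next one.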

There are several ways of achieving similar coverage. Some people prefer to import specific components they want to assert are rendered. We find it enough to snapshot test these situations.

In order to simulate click, change or other events, we use Enzyme’s simulate API a lot.

Tip: Using React, Enzyme & Jest? Consider using enzyme-matchers, which allow you to write more expressive assertions.

MST Stores 🥤

This was the most important part of the migration for us. Our MST stores hold all of the business logic and control every React component (directly or indirectly) being rendered. We needed to do this one right.

All of our stores are created in the following way:

Key difference in these tests: when we test MST stores, there isn’t a direct import to the file being tested. Weird, right?

We do this because we want to share our mocks throughout our tests. A test should not need to know how to create and configure the store’s dependencies, or how to stub every method of those dependencies that might get called. So we always import a file from a mocks folder, but the exported default is not a mock…

We set up our UserStore with mocked dependencies and export it for our unit test

This is the file that we will be testing. It is importing the implementation of the UserStore but we don’t import it from the test. The store is then created with its environment as Jest mocks. We also export a way to create a store with a given initial state (write once, configure many).
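The shape of such a file can be sketched without MST or Jest. Everything in the snippet is a stand-in: createUserStore replaces the real MST store, the small stub helper plays the role of jest.fn(), and the api and errorStore names simply follow the ones used in this post.

```javascript
// Hypothetical stand-in for the real store implementation (the post uses MST).
function createUserStore(initialState, env) {
  const state = { user: null, ...initialState };
  return {
    getUser: () => state.user,
    async load(id) {
      try {
        state.user = await env.api.fetchUser(id);
      } catch (error) {
        env.errorStore.add(error);
      }
    },
  };
}

// Minimal call-recording stub, standing in for jest.fn() so the sketch runs without Jest.
function stub() {
  const fn = (...args) => {
    fn.calls.push(args);
    return fn.impl(...args);
  };
  fn.calls = [];
  fn.impl = () => undefined;
  return fn;
}

// The shared "mocks" file: dependencies are created once, here.
const api = { fetchUser: stub() };
const errorStore = { add: stub() };

// Tests call createStore() and never wire dependencies themselves.
const createStore = (initialState = {}) =>
  createUserStore(initialState, { api, errorStore });

// Usage: configure behavior per test, then assert on the shared stubs.
async function demo() {
  api.fetchUser.impl = async () => {
    throw new Error('Network down');
  };
  const store = createStore({ user: { name: 'Ada' } });
  await store.load(42);
  return { reported: errorStore.add.calls.length, user: store.getUser() };
}
```

The test only decides how a dependency behaves; how the store gets hold of that dependency stays in one shared place.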

Apart from the configuration of the store itself, we also create a barebones manual mock for each of our stores-with-mocked-dependencies (the API is also modeled as an MST store). We just export the mocked API our tests will need to override behavior for:

We create a UserStore mock for others to use

One could say this file structure is the only “helper” (or indirection) one needs to understand. Once you grasp this, you can test your stores like this:

We reference our mocks directly from the unit test. We don’t import the implementation directly.

And there we go. api is already mocked and our store holds the correct reference. The same goes for errorStore, so we can just stub behavior and check that the code does what we need. These mocks are cleared after every test.

The test is expressive enough to understand what is going on. This makes it easier to find culprits for failing tests 🎉.

Outcome

In numbers 💯

  • 5 people from 2 teams were involved
  • Just 1 sprint 🏅
  • 1 pull request
  • 157 test suites
  • 958 tests
  • 22k lines rewritten
  • More than 11 times faster: 340s to 30s
  • Line coverage 93.76%

As you can see it was a tremendous effort that involved a LOT of team collaboration and knowledge sharing.

In words ✍️

  • Feeling of ownership: we now feel way more confident with our codebase. We have a very good feeling of ownership of our own code. We understand it, we have a plan for it.
  • Team bonding: this migration involved a lot of people. Although 5 people were the core collaborators for the migration, multiple people were onboarded during the process, because they wanted to help out. What did that do to the team? We bonded a lot during those 2 weeks, and we shared a lot thanks to the collaboration.
  • Thanks to Jest’s ecosystem and built-in features, we got rid of dependencies like sinon, timeout-promisify, enzyme-json, nyc and some others. We believe in using the right tool for the right job, don’t get us wrong. In this case, these dependencies were there to “make things work”. We did not take the time needed to understand them properly, so we ended up working around them rather than leveraging them.

Technical improvements 👩‍💻

Some of the improvements we are now benefitting from came purely from rewriting our whole suite. We could take a fresh look at our WHOLE test codebase and change what we did not like.

A couple of things we fixed along the way that made a difference:

  • Timezone agnostic: before this, our test script looked something like this: "test": "TZ=UTC ava --config file.whatever". We hated that TZ=UTC. We are doing very little with dates and times, so this was the first to go. See date-fns/issues/571 for more.
  • No downloading: remember how I previously said we don’t care about the actual values of strings? Well, we used to have that by downloading the latest version of strings before every run… Yeah, we know.
  • Speed: +10x 🏎💨
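As an illustration of the problem behind that TZ=UTC flag (not the exact change made in this migration), the sketch below formats the same instant in two ways. The function names are hypothetical. The UTC version gives the same answer on every machine; the local version depends on the time zone of the process, which is what pushes people towards pinning TZ in the test script.

```javascript
// Formats a Date as YYYY-MM-DD in UTC, so the result does not depend on the process time zone.
const formatDayUtc = (date) => date.toISOString().slice(0, 10);

// Same idea with local-time getters: the result can change with the machine's time zone.
const formatDayLocal = (date) =>
  [date.getFullYear(), date.getMonth() + 1, date.getDate()]
    .map((part) => String(part).padStart(2, '0'))
    .join('-');

const lateEvening = new Date('2019-03-10T23:30:00Z');
console.log(formatDayUtc(lateEvening)); // 2019-03-10 in every time zone
console.log(formatDayLocal(lateEvening)); // depends on where the code runs
```

A test asserting on the first function passes anywhere; a test asserting a fixed string for the second one only passes where the time zone happens to match.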

Time to say bye

But first, let me share some 🔑 learnings with you.

What would we do differently? 🤷‍♂️

Our biggest struggle came towards the end of the migration, and we could have mitigated it by doing the following:

  • Earlier feedback on coverage: we did not check for coverage for individual files we were migrating or testing. We should have had this check before every commit or similar.
  • Migrate by type of file (services, components & stores): each of us grabbed a particular folder or feature to migrate at a time. This was very easy to coordinate, but delayed the acquired knowledge until the very end. Why? Because we did not finish testing one type of file, and learning how to do it best, before starting on another type. Also, some problems with particular approaches only surfaced at the end of the migration (more on this in the next section).
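One way to get that earlier coverage feedback is Jest's coverageThreshold option, which fails the run when coverage drops below a limit. The paths and percentages below are assumptions for the sake of the sketch, not the values we used.

```javascript
// jest.config.js (sketch; paths and percentages are assumptions)
const config = {
  collectCoverage: true,
  collectCoverageFrom: ['src/**/*.js'],
  coverageThreshold: {
    // Jest fails the run when line coverage falls below this value.
    global: { lines: 90 },
    // A path or glob key gets its own threshold, checked independently of `global`.
    './src/stores/': { lines: 90 },
  },
};

module.exports = config;
```

Running this before every commit, or at least on every push, would have surfaced under-tested files while the person migrating them still had the context fresh.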

Things to look out for and how we solved them 🐛

  • Memory leak: this was a pretty tough one. The last MST store test that we migrated was taking around 20 seconds to finish, and sometimes node aborted the run after running out of memory. The workaround for us was to use manual mocks instead of generating an automatic one based on the exported store instance. If you are using MST, be careful with this. We isolated the issue as best as we could in jest/issues/8112.
  • Order of imports: if you go back to the way we mock stores in order to test them, you will notice there are several jest.mock('...') calls in different imported files. The order of the imports affected which store ended up holding the mock references. Our advice is to check how jest.mock works to understand this.
  • window.location, Date & friends: these are always tricky. Different projects handle these differently as mocks. See how we ended up configuring Jest ourselves here.
  • Async error handling: if you have async tests with a try/catch block in which you assert some error handling, make sure you call expect.hasAssertions(). Otherwise, the test will pass whenever nothing is thrown, because the expectations in your catch block never run.
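The async pitfall from the last bullet is easy to reproduce without a test runner. In the sketch below everything is hypothetical: saveUser is a buggy function that resolves when it should reject, and run is a tiny stand-in for a runner whose requireChecks flag plays the role of expect.hasAssertions().

```javascript
// Hypothetical function under test: it should reject, but a bug makes it resolve.
async function saveUser() {
  return 'saved';
}

// Shaped like the risky test: the only check lives inside the catch block.
async function riskyTest(check) {
  try {
    await saveUser();
  } catch (error) {
    check(error.message === 'Network down');
  }
}

// Tiny stand-in for a test runner. `requireChecks` plays the role of expect.hasAssertions().
async function run(testFn, { requireChecks }) {
  let checks = 0;
  const check = (ok) => {
    checks += 1;
    if (!ok) throw new Error('check failed');
  };
  try {
    await testFn(check);
    if (requireChecks && checks === 0) throw new Error('no checks were run');
    return 'passed';
  } catch (error) {
    return `failed: ${error.message}`;
  }
}

run(riskyTest, { requireChecks: false }).then(console.log); // passed (a false positive)
run(riskyTest, { requireChecks: true }).then(console.log); // failed: no checks were run
```

In a Jest test, calling expect.hasAssertions() at the top gives you the second behavior: a test that never reaches its expectations fails instead of passing silently.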

Last piece of advice to other teams 🚀

  • Don’t be scared of big tasks, sometimes 1 sprint is all it takes.
  • Trust in strong collaboration.
  • Don’t be tempted to fix bugs within the migration: it will just be another merge pain, and you want that fix in “master”.
  • Merge your “master” often. Several small conflicts are better than a huge one.
  • Constantly share tips on how you are solving things: communication is key. Validate your understanding of how something works. Leave comments in the PR. Always.
  • Try to rethink your tests, codemods won’t save you: at least we did not find a “silver bullet” that migrated our tests AVA -> Jest by running a script.
  • This is the time to try stuff that might give you big returns: do you have an idea that might improve your testing experience? Timebox it to a couple of hours. Try it out. Maybe you find a golden nugget 🏆.
  • Have a clear timeline to deliver, prevent scope creep as much as possible: timebox your migration. Define what you expect to deliver by the end.
  • Have fun testing! I hope you learn as much as we did.

👋