Solving Flaky React Unit Tests

Bhavin Agarwal
Published in SAFE Engineering
7 min read · Mar 30, 2022
Image Credits: https://burst.shopify.com/

It was a Friday morning and we had just logged in to work. People were doing their regular check-ins when, suddenly, we saw Jenkins failing on unit tests. The logs said one of the tests had timed out, but when we ran the tests locally, they passed 😕

It was then that we realized that some flakiness existed in our test suite. It became more evident as we kept adding tests, a sign that something was wrong with how we were writing our UTs.

We then went through our whole React test suite and identified some general patterns that we were either overdoing or not doing properly 😞.

Below we talk about our realisations on what to actually test in a unit test, the general testing pattern we then followed, some common precautions we took while using Jest with React Testing Library (RTL), and some tips that helped us reduce the flakiness in our unit test suite.

Let us surf right through them 🏄!

What to Test?

Unit tests are an integral part of a codebase. They help us gain confidence in the code we write, but do they give us any confidence in how our customers will use it? This is a bit tricky to answer 😕

As per Kent C. Dodds:

Firstly, test that piece of the feature that you would cry on the most if broken.

And this is true! Sometimes, we spend so much time testing things that our users will never see or simply don’t care about. As a result, our tests often miss the most-used functionality of our code.

So, in line with what Kent said, test the things that might make our users abandon our work if they go wrong.

How to Test?

We started following the widely used Arrange-Act-Assert (AAA) pattern for our unit tests. In short:

  1. Arrange: Set up the data for the function or component under test. This is also where we set up the mocks that the Act and Assert phases rely on.
  2. Act: Invoke the function, make the server request, or render the component with the data prepared in step one.
  3. Assert: Verify the outcome: the function’s result or calls, the URL that was hit, or the rendered component.
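
The three phases can be sketched without any test framework. This is only an illustration: `applyDiscount` and `discountTestBody` are hypothetical names, and in a real suite the body would sit inside a Jest test with `expect` doing the assertion.

```typescript
// Hypothetical function under test: applies a percentage discount to a price.
function applyDiscount(price: number, percent: number): number {
  if (percent < 0 || percent > 100) {
    throw new RangeError("percent must be between 0 and 100");
  }
  return price - (price * percent) / 100;
}

// A test body laid out as Arrange, Act, Assert. It returns the value it
// asserted on so the result is easy to inspect.
function discountTestBody(): number {
  // Arrange: prepare the inputs (mocks would also be set up here).
  const originalPrice = 200;
  const discountPercent = 25;

  // Act: invoke the unit under test exactly once.
  const discountedPrice = applyDiscount(originalPrice, discountPercent);

  // Assert: verify the observable result.
  if (discountedPrice !== 150) {
    throw new Error(`expected 150, got ${discountedPrice}`);
  }
  return discountedPrice;
}

const aaaResult = discountTestBody();
```

Keeping the phases visually separate makes it easier to spot a test that acts twice or asserts before the action has happened.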

The Don’ts

While Jest and RTL give us some general ways of testing things, we can sometimes use them in the wrong way, which makes our tests flaky. Such tests might still run and even pass locally, but we generally see issues when they run in a CI environment with limited resources. Below are some precautions we arrived at while fixing the flaky behaviour, especially when using React Testing Library.

Precaution 1: Try to run your tests in an isolated environment

While debugging the flaky tests, we found that some tests depended on other services, and if any of those services failed, the tests failed too. Thus, it is very important to:

  • Mock external services inside our unit tests.
  • Make sure that no other service is running while the tests run.

In our case, we used to start all the Docker containers and then run the unit tests in the CI environment. After seeing the flaky behaviour, we added a separate stage to our Jenkins pipeline that runs the unit tests before any service is started.
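
As a rough sketch of the first point, a unit can depend on a small contract and receive a hand-written stub in tests, so no real service has to be up. `UserDirectory`, `greetUser` and `createStubDirectory` are hypothetical names for illustration; with Jest, `jest.fn()` or `jest.mock()` would usually play the role of the stub.

```typescript
// Hypothetical contract for an external user service.
interface UserDirectory {
  findDisplayName(userId: string): string;
}

// Unit under test: depends only on the contract, not on a live service.
function greetUser(directory: UserDirectory, userId: string): string {
  return `Hello, ${directory.findDisplayName(userId)}!`;
}

// Hand-written stub that records how it was called and returns canned data.
function createStubDirectory(cannedName: string) {
  const requestedIds: string[] = [];
  const directory: UserDirectory = {
    findDisplayName(userId: string): string {
      requestedIds.push(userId);
      return cannedName;
    },
  };
  return { directory, requestedIds };
}

const stubbedDirectory = createStubDirectory("Asha");
const greetingText = greetUser(stubbedDirectory.directory, "user-42");
```

Because the stub never touches the network, the test result no longer depends on whether another container happens to be healthy.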

Precaution 2: Don’t mock useSelector

We sometimes mock useSelector() in React-Redux based components to get a data-driven render without Redux playing any role. This interferes with how React-Redux actually renders a component and thus lowers our confidence in the test. Instead, we should render our component with a pre-populated state.

The wrong way 🚫:

The correct way ✅:
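
A dependency-free sketch of the difference, assuming a component that reads its data through a selector hook. Everything here (`AppState`, `selectItemCount`, `renderCartBadge`, the two hook factories) is hypothetical and only models the idea; with real React-Redux, the second approach usually means wrapping the component in a `Provider` whose store was created with a preloaded state.

```typescript
// Hypothetical slice of application state.
interface AppState {
  cart: { items: { name: string; quantity: number }[] };
}

let selectorRuns = 0;

// Real selector: total number of units in the cart.
function selectItemCount(state: AppState): number {
  selectorRuns += 1;
  return state.cart.items.reduce((total, item) => total + item.quantity, 0);
}

// Shape of a useSelector-like hook.
type UseSelectorLike = <T>(selector: (state: AppState) => T) => T;

// Stand-in for a connected component: reads its data through the hook.
function renderCartBadge(useSelectorHook: UseSelectorLike): string {
  return `Cart (${useSelectorHook(selectItemCount)})`;
}

// Wrong way: the hook is mocked to return a canned value, so the real
// selector never runs and a bug inside it would go unnoticed.
function mockedUseSelector<T>(): T {
  return 3 as unknown as T;
}
const badgeFromMock = renderCartBadge(mockedUseSelector);
const selectorRunsAfterMock = selectorRuns;

// Better: back the hook with a pre-populated state so the real selector
// runs against realistic data.
function hookForState(state: AppState): UseSelectorLike {
  return function useSelectorFromState<T>(selector: (s: AppState) => T): T {
    return selector(state);
  };
}
const preloadedCartState: AppState = {
  cart: { items: [{ name: "pen", quantity: 2 }, { name: "ink", quantity: 1 }] },
};
const badgeFromState = renderCartBadge(hookForState(preloadedCartState));
const selectorRunsAfterState = selectorRuns;
```

Both renders produce the same text, but only the second one exercises the selector, which is the part a mocked hook silently skips.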

Precaution 3: Don’t make user events inside waitFor

waitFor is used to wait for an assertion that might take an indefinite amount of time to pass. Putting user events inside a waitFor doesn’t make sense: firing a user event inside a waitFor does not wait for that event to happen. So, we should use waitFor only for assertions.

The wrong way 🚫:

The correct way ✅:

Another problem with keeping user events in waitFor is that its callback may run multiple times until the assertion inside it passes, and thus fire the user events multiple times! This is certainly not what we want 😕
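
The repeated firing can be reproduced with a small synchronous model. `retryUntilPasses` is a hypothetical stand-in for waitFor (it simply re-runs its callback until the callback stops throwing), and `makeFakeForm` is a made-up form whose save completes on the third time it is inspected; neither is RTL code.

```typescript
// Hypothetical stand-in for waitFor: re-runs the callback until it stops
// throwing, and returns how many attempts were needed.
function retryUntilPasses(callback: () => void, maxAttempts = 5): number {
  let attempts = 0;
  let lastError: unknown;
  while (attempts < maxAttempts) {
    attempts += 1;
    try {
      callback();
      return attempts;
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Made-up form: once clicked, it reports "saved" on the third inspection.
function makeFakeForm() {
  let clicks = 0;
  let inspections = 0;
  return {
    click(): void { clicks += 1; },
    isSaved(): boolean { inspections += 1; return clicks > 0 && inspections >= 3; },
    clickCount(): number { return clicks; },
  };
}

// Wrong way: the click lives inside the retried callback.
const formClickedInsideRetry = makeFakeForm();
retryUntilPasses(() => {
  formClickedInsideRetry.click();
  if (!formClickedInsideRetry.isSaved()) { throw new Error("not saved yet"); }
});
const clicksWhenInsideRetry = formClickedInsideRetry.clickCount(); // 3

// Better: click once, then retry only the assertion.
const formClickedOnce = makeFakeForm();
formClickedOnce.click();
retryUntilPasses(() => {
  if (!formClickedOnce.isSaved()) { throw new Error("not saved yet"); }
});
const clicksWhenOutsideRetry = formClickedOnce.clickCount(); // 1
```

In RTL terms, the second shape corresponds to firing the user event first and keeping only the expectation inside the waitFor callback.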

Precaution 4: Don’t use one component for all the tests

Do not use a single instance of a component for all of your tests. State updates made to the component in one test carry over into the tests that follow.

The wrong way 🚫:

The correct way ✅:
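
A minimal model of the leak, assuming a stateful view object in place of a rendered component. `FakeCounterView` and `renderFreshCounter` are hypothetical; with RTL the equivalent is calling `render` inside each test rather than once at the top of the file.

```typescript
// Hypothetical stand-in for a rendered component that holds state.
class FakeCounterView {
  private count = 0;
  clickIncrement(): void { this.count += 1; }
  text(): string { return `Count: ${this.count}`; }
}

// Wrong way: one instance created up front and reused by every test.
const sharedView = new FakeCounterView();
function firstTestWithSharedView(): string {
  sharedView.clickIncrement();
  return sharedView.text();
}
function secondTestWithSharedView(): string {
  // Expects a pristine counter, but sees the click from the first test.
  return sharedView.text();
}
const sharedResults = [firstTestWithSharedView(), secondTestWithSharedView()];

// Better: every test builds its own instance.
function renderFreshCounter(): FakeCounterView {
  return new FakeCounterView();
}
function firstTestWithFreshView(): string {
  const view = renderFreshCounter();
  view.clickIncrement();
  return view.text();
}
function secondTestWithFreshView(): string {
  return renderFreshCounter().text();
}
const isolatedResults = [firstTestWithFreshView(), secondTestWithFreshView()];
```

With the shared instance, the second test passes or fails depending on whether the first one ran before it, which is exactly the kind of order dependence that shows up as flakiness.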

Precaution 5: Don’t make your test async if no async events are happening

We are so used to adding (or copy-pasting 😛) tests that we sometimes pay little attention to how the initial structure of the test was written. One such case is using the async keyword in a test even when no async event is happening 👀. Take a close look at what the test does and decide wisely whether it needs async.

The wrong way 🚫:

The correct way ✅:
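
A small sketch of what the keyword changes, using hypothetical test bodies outside any framework. An async function always returns a promise, so a runner such as Jest has to wait for it to settle even when nothing inside is awaited.

```typescript
// Hypothetical synchronous unit used by both test bodies.
function formatLabel(count: number): string {
  return `Items: ${count}`;
}

// Wrong way: nothing is awaited, yet the body is async, so calling it
// hands the runner a promise to wait on.
const needlesslyAsyncTest = async (): Promise<void> => {
  if (formatLabel(2) !== "Items: 2") { throw new Error("unexpected label"); }
};

// Better: a plain synchronous body finishes as soon as it returns.
const plainSyncTest = (): void => {
  if (formatLabel(2) !== "Items: 2") { throw new Error("unexpected label"); }
};

function returnsPromise(value: unknown): boolean {
  return value instanceof Promise;
}

const asyncBodyReturnsPromise = returnsPromise(needlesslyAsyncTest()); // true
const syncBodyReturnsPromise = returnsPromise(plainSyncTest()); // false
```

The reverse mistake matters too: a test that does await something must stay async, otherwise the runner can finish before the assertion runs.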

Precaution 6: Don’t wrap render in act

React Testing Library already wraps the render function inside an act. We should NOT wrap a render inside another act.

The wrong way 🚫:

The correct way ✅:

Precaution 7: Don’t use a separate container object to render the component in tests

Most of the UTs in our code used a separate element called container, which was nothing but a div element where we mounted our React component to render it in our tests.

This is not required at all. RTL creates this element for us and renders into it automatically when we use render from RTL.

“But wait, how do we make our selector queries (getByTestId(), getByText()) work? I have seen that they require a container element as the first argument (getByText(container!, “some-text”)). No 🤔?” you might say.

We actually have a handy screen object which we can use to replicate this behaviour. The best part is that it always reflects the latest state changes present in the DOM and thus does not require any rerender 😃

The wrong way 🚫:

The correct way ✅:

Taking up the above precautions helped us solve the flaky behaviour to a great extent.

While working on this, our team also got into discussions around a few other topics, which we cover below.

Some General Tips

1. Don’t take snapshots for granted

Whether or not to use snapshots has long been a point of debate. First, let us list some pros and cons:

Pros:

  1. Every detail of the component is captured.
  2. Get to know the impact of a component on other components.
  3. Easy to update.

Cons:

  1. Require frequent updates (fragile).
  2. Are easily missed in code reviews.
  3. Add up to a lot of code.

Looking at the points above, here is what many suggest about when to avoid snapshot testing. We should not use snapshot testing when:

  1. The component is long and complex: If a component is either a very long piece of HTML/JSX or involves a lot of complex external business logic, we should not use a snapshot.
  2. The component changes too often: If a component’s UI changes frequently, we should not include a snapshot test for it.

Note: Since snapshots ultimately become a part of our codebase, it’s very important to review and update them like any other function in our code review process. Remember, for simple components a snapshot test may sometimes be enough on its own. Even then, it’s very important to analyze what that snapshot contains: does it include only loaders and not the actual data? Is it rendering the correct colours?

2. Render only the required component

Sometimes, we tend to test a component together with some other component (say, a child component). It’s recommended to test only what we need to test. This keeps our tests small, requires less mocking and, overall, makes them easier to maintain 🙂.

If we ever find that some kind of arrangement is required for that component to render, we prepare that arrangement inside the test itself!

For example, do we want to test a Modal component that needs a ref to render? Create a new arrangement component inside the test itself.

3. Chrome extension for handy RTL queries

The Testing Playground Chrome extension is really good: it gives you query suggestions for your component right in Chrome. So cool 😃!

The suggestions are sometimes tough to deal with, but I find myself reaching for the extension first whenever I am confused about how to get something from a component.

The testing playground chrome extension

Note: You may find yourself writing better queries than suggested by the above extension. Use wisely 🙂

End note

There are many more ways in which we introduce flakiness into our tests, such as improper mocks of external services, not resetting mocks between tests, etc. Debugging flaky tests can take a huge toll on engineering, as dealing with them means stepping into unknown territory where you don’t know how to reproduce the failure 😫. Test suites become more fragile as different teams keep adding tests, piling up the flaky behaviour if it is not taken care of at an early stage. We should review tests as carefully as we review the code they are testing 🙂

Hope you found these precautions and suggestions helpful! Feel free to comment on more cases that you might have encountered 🙂
