How we fixed our flaky unit test suite

Marco Nicolodi
Qualyteam Engineering
6 min read · Jan 14, 2019
Photo by Louis Reed on Unsplash

There’s no agile without automated tests. Still, bad automated tests can slow down your product delivery.

Unit tests should help you refactor code and deliver features by ensuring your use cases keep working without introducing new bugs.

We struggled for years with annoying unit tests that would break with even small refactorings. They were hard to maintain and hard to read. And, even worse, they didn’t ensure our app worked.

And we fixed it.

We’ll show you the problems we identified and how we solved them with research and development.

Fixtures

The state before

Have you ever found yourself in a situation where you just wanted to test the ‘Apply for a job listing’ use case, but you needed the applicant, their past experiences, their skills, the employer, and the job? The data required to run a test case is called a test fixture. Complex test fixtures can be hard to read, which leads developers to copy and paste them across tests, sometimes creating unnecessary data for a given test. Abusing the new keyword breaks the encapsulation of object instantiation and leaks construction details all over the tests, making them harder to maintain.

Maybe you repeatedly created all your fixtures within each test case. At least we did. Below is an example of a test fixture used to test a use case in our Documents micro-service.
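The snippet is a representative sketch rather than our verbatim code; the entity and member names are illustrative:

    // Every test repeated a setup like this, newing up the whole object graph.
    var author = new Employee("John", "john@acme.com");
    var approver = new Employee("Jane", "jane@acme.com");
    var stages = new List<RevisionStage>
    {
        new RevisionStage(order: 1, responsible: author),
        new RevisionStage(order: 2, responsible: approver)
    };
    var document = new Document("Quality Manual", author, stages);
    document.RequestRevision(author);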

How we fixed it

We used a pattern called Object Mother, as described by Martin Fowler. It’s an object factory that carries common data needed by multiple tests. Object Mothers have essentially one job: to give birth to good defaults and test-ready entities, with as little code duplication as possible, by reusing their own methods. Since Object Mothers have several methods for modifying specific groups of things and putting an entity into a specific state, it makes sense to expose those methods, letting developers tweak exactly what a given test requires.

Below is the result of applying Martin Fowler’s Object Mother to the test fixture above.
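This sketch reuses the illustrative names from before; DocumentMother and its helpers stand in for whatever API you design:

    // The Object Mother centralizes construction and carries good defaults.
    public static class DocumentMother
    {
        // Gives birth to a test-ready entity with sensible defaults.
        public static Document Simple() =>
            new Document("Quality Manual", EmployeeMother.Simple(), RevisionStageMother.TwoStages());

        // Exposes a method that puts the entity in a specific state.
        public static Document WithRevisionInProgress()
        {
            var document = Simple();
            document.RequestRevision(document.Author);
            return document;
        }
    }

    // The fixture above collapses to a single readable line inside each test:
    var document = DocumentMother.WithRevisionInProgress();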

Reduced test fixture using object mothers

Dependencies

The state before

We use Application Services to orchestrate our use cases, and at the time we had one with many use case methods related to the Documents aggregate, such as “Request new document revision” and “Approve document revision stage”. This service had a lot of dependencies.

The System Under Test instantiation was being repeated all over the code.

Again, the new keyword was repeated all over the tests. When I added a new dependency to this service, the application build broke with 119 errors, even though we use dependency injection in our production code.
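Concretely, dozens of test files contained a hand-rolled construction along these lines (the dependency names are illustrative):

    // Repeated in every test: one new constructor parameter breaks them all.
    var sut = new DocumentAppService(
        documentRepository,
        revisionRepository,
        documentBusinessEngine,
        emailService,
        messageBroker);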

How we solved

We use Builders and Factories to assemble the System Under Test (SUT) we want for a given test case.
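Here is a minimal sketch of such a builder; the interfaces and the in-memory/fake defaults are illustrative assumptions, not our production code:

    public class DocumentAppServiceBuilder
    {
        // Good defaults, so tests only override what they care about.
        private IDocumentRepository _repository = new InMemoryDocumentRepository();
        private IEmailService _emailService = new FakeEmailService();
        private IMessageBroker _messageBroker = new FakeMessageBroker();

        // The new keyword for the SUT lives only inside the builder.
        public static DocumentAppServiceBuilder New() => new DocumentAppServiceBuilder();

        public DocumentAppServiceBuilder WithRepository(IDocumentRepository repository)
        {
            _repository = repository;
            return this;
        }

        public DocumentAppServiceBuilder WithEmailService(IEmailService emailService)
        {
            _emailService = emailService;
            return this;
        }

        public DocumentAppServiceBuilder WithMessageBroker(IMessageBroker messageBroker)
        {
            _messageBroker = messageBroker;
            return this;
        }

        public DocumentAppService Build() =>
            new DocumentAppService(_repository, _emailService, _messageBroker);
    }

    // A test then assembles exactly the SUT it needs:
    var sut = DocumentAppServiceBuilder.New()
        .WithRepository(repositoryStub)
        .Build();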

Abstracting the new keyword with a builder

By using these abstractions you:

  • Keep the new keyword in a single place, inside the builder
  • May define default dependencies for a SUT
  • Avoid passing dummy or null dependencies to a SUT
  • Override only the dependencies needed for a given test case

We use fluent builders for classes that have many dependencies and plain factory methods for classes that don’t.
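For the simple case, a sketch of such a factory method (again with illustrative names):

    // Few dependencies: a plain factory method keeps new out of the tests.
    private static DocumentValidator CreateValidator() =>
        new DocumentValidator(new FakeClock());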

Assertions

Assertions can be hard to understand, depending on the nature of the test and the abstractions the system under test exposes. Regression tests, for example, often have confusing assertions that don’t tell what they are ensuring. In other cases, it’s just the nature of the test to have a hard-to-read assertion. For instance, in an assertion like the one below, what does lastStage.Order = 2 mean?
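An illustrative xUnit sketch of such an assertion, following the earlier names:

    // What does the magic number 2 stand for here?
    var lastStage = document.CurrentRevision.LastStage();
    Assert.Equal(2, lastStage.Order);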

That’s why we use the Fluent Assertions library for test assertions. Besides being more BDD (Behavior-Driven Development) oriented, it lets you justify assertions that don’t look obvious. Below is the result of rewriting the above xUnit assertion with Fluent Assertions.
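The justification message in this sketch is illustrative (requires using FluentAssertions;):

    // The because argument turns a cryptic assertion into documentation.
    lastStage.Order.Should().Be(2,
        because: "cancelling a revision returns the document to its last approved stage");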

Assertions using the Fluent Assertions library

This library is written in C#, but most languages have a fluent assertion library.

Mocks

Every time you mock a dependency to make it return something you want, you couple your use case to an implementation detail. For example, in the test for cancelling a document revision, we mocked the Document Business Engine dependency.
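A sketch of that setup, assuming Moq and that the builder from earlier also exposes a WithBusinessEngine override; the interface and method names are illustrative:

    // using Moq;
    // The test now knows the service consults IDocumentBusinessEngine:
    // an implementation detail leaking into the test.
    var businessEngine = new Mock<IDocumentBusinessEngine>();
    businessEngine
        .Setup(engine => engine.CanCancelRevision(It.IsAny<Document>()))
        .Returns(true);

    var sut = DocumentAppServiceBuilder.New()
        .WithBusinessEngine(businessEngine.Object)
        .Build();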

Why does the system under test need to call the business engine? It certainly holds a rule relevant to this use case, so why hide that rule and make the test less of a documentation? Besides that, if you swap the business rule engine for another class, the tests will break even though the use case still works, making refactoring harder instead of easier.
So what should you mock?

Mock only unstable dependencies

What are unstable dependencies? They are dependencies that rely on infrastructure, third-party services, or out-of-process components, or that are non-deterministic. Unstable dependencies may include, but are not limited to:

  • Message Brokers
  • Email services
  • HTTP requests
  • Database access
  • Caching
  • Logging
  • Monitoring
  • Third party services
  • File system access
  • Date/Time libraries
  • Random number generators

Our business rules engine is not an unstable dependency, so we stopped mocking it. That makes the SUT behave like a real usage scenario and lets us write more test scenarios covering different inputs, documenting the use case through our test suite. Which leads to our last and most important improvement.

Test cases

We often wonder what to test. Should I test this string helper? Or this Redux saga? Remember that every test has a cost: CI pipeline time, development time, developer experience. That cost must come with value.

Spend your best efforts testing use cases

Valuable tests ensure the application works, with minimal false positives and little effort. Using mocks makes tests easier to write but increases the chance of false positives. Testing implementation details, i.e. the auxiliary classes used inside your use case orchestration classes (or functions), doesn’t ensure the application works. That’s why you should focus your efforts on testing the use case’s public API surface rather than asserting that a mock was called, or that a data repository returns the right set of data.

Thankfully, there is an agile testing technique that helps us in writing self-documenting and valuable test cases.

Let the (B)ehavior (D)rive the test case (D)evelopment

Behavior-Driven Development is a technique that helps us test components guided by the user’s behavior. Even though BDD is better suited to end-to-end acceptance tests, you can nevertheless base your unit tests on the behavior of your use cases. Use a ubiquitous language to describe your tests, and avoid describing implementation details such as “when file payload is null”. Prefer “when the user deleted the file”.

Use BDD scenarios

We also stopped using the method name as the test scenario description and started making heavy use of BDD scenarios:

GIVEN a precondition
AND another precondition
WHEN I do something
THEN a post condition happens
AND another post condition happens
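
In xUnit, one way to attach such a scenario to a test is through the DisplayName of a Fact; the scenario text below is illustrative:

    [Fact(DisplayName = "GIVEN a document under revision " +
                        "WHEN the user cancels the revision " +
                        "THEN the revision is discarded")]
    public void CancelRevisionScenario()
    {
        // GIVEN / WHEN / THEN steps go here.
    }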

It doesn’t work only for use case scenarios. For example, you could use a BDD scenario to test the behavior of a list:

GIVEN a list with one item
WHEN I remove that item
AND I try to get another item
THEN the list should return empty
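
As a sketch of that scenario in code, using xUnit and Fluent Assertions:

    [Fact(DisplayName = "GIVEN a list with one item " +
                        "WHEN I remove that item " +
                        "THEN the list should be empty")]
    public void RemovingTheOnlyItemLeavesTheListEmpty()
    {
        var list = new List<string> { "item" };

        list.Remove("item");

        list.Should().BeEmpty();
    }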

Result

Below is a test scenario using these patterns. We know there’s a lot of room for improvement, and at Qualyteam we are constantly improving.
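What follows is a reconstruction in the same spirit; every name is illustrative, and the builder is assumed to expose a WithDocuments override that seeds the repository:

    [Fact(DisplayName = "GIVEN a document under revision " +
                        "WHEN the user cancels the revision " +
                        "THEN the document returns to its last approved stage")]
    public async Task CancelRevision()
    {
        // GIVEN: the Object Mother builds the fixture, the builder assembles the SUT
        var document = DocumentMother.WithRevisionInProgress();
        var sut = DocumentAppServiceBuilder.New()
            .WithDocuments(document)
            .Build();

        // WHEN
        await sut.CancelRevision(document.Id);

        // THEN
        document.CurrentRevision.Should().BeNull();
        document.LastStage().Order.Should().Be(2,
            because: "cancelling a revision returns the document to its last approved stage");
    }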

You can notice:

  • The BDD style scenario description tells us exactly what we are testing
  • The absence of the new keyword
  • Fluent assertions explaining an odd but necessary assertion (the last one)
  • The lack of mocks being set up

Wrapping up

You might be wondering whether use case tests that don’t mock collaborator classes are actually integration tests. We usually don’t care about labels, but we consider these tests sociable unit tests. The point is: does your unit test suite allow you to refactor implementation details without failing, while giving you the confidence that your application still works?

I’m a big fan of Kent C. Dodds, a prominent JavaScript and React developer. In the frontend context, he advocates for unit tests that behave more like functional tests, exercising the user’s behavior. He even wrote a testing library for that. So maybe you shouldn’t spend so much effort testing implementation details, like a Redux saga, and should focus more on what that saga does for the user. We used his mindset to refactor our tests, adapting it to the backend.

Also, I want to thank my fellow developers Pedro Ramos and Leonardo Prange for helping with the R&D job.
