Flaky Tests — How to Overcome Testing Nightmares.

Aug 15, 2024

Have you ever written or run software tests that both pass and fail across multiple test runs, without any change to the code?

That is the definition of a flaky test.

If it sounds like a nightmare to deal with, it was, but not anymore!
See https://flakeguard.com/docs

What causes flaky tests?

Flaky tests are usually caused by nondeterministic factors: timing issues (especially in asynchronous code), race conditions, network variability, test order dependency, and reliance on external systems or service responses. These factors make debugging extremely difficult and can translate into unwanted issues for the end user.

Why do flaky tests matter?

At Google, of the 115,160 test targets that had previously passed and failed at least once, 41% were flaky. Of the 3,871 distinct builds sampled from Microsoft’s distributed build system, 26% failed due to flakiness.

Test flakiness has been reported within all sectors of the software engineering field, from independent open-source software projects to the proprietary products of some of the world’s largest companies (Lam et al., 2019).

Did you know?

“Historical patterns of flaky tests in Chrome, identified that 40% of flaky tests remain unresolved, while 38% are typically addressed within the initial 15 days of introduction” (Malmir & Rigby, 2024).

Developer impacts of flaky tests

Reduced Confidence & Efficiency: Flaky tests undermine trust in the test suite, as developers can’t rely on unpredictable, contradictory results to reveal the true state of the code.

Masked Issues: Real bugs might be overlooked when genuine failures are dismissed as flakiness, and flaky failures might be mistaken for real bugs.

Increased Maintenance: More effort is needed to maintain and stabilize the test suite, which can slow down development cycles.

Continuous Integration Disruptions: They can cause interruptions in automated build and deployment processes, delaying releases.

Best practices to AVOID flaky tests

Ensure tests are isolated to avoid shared state dependencies.
Control timing with fixed timeouts (adjusting wait times) and avoid relying on fluctuating conditions.
Mock external dependencies for consistent interactions and reduce variability.
Use stable, controlled test data to prevent unexpected input changes.
Run tests in a repeatable environment for consistency and reproducibility, boosting confidence in results and accelerating development cycles.

How to OVERCOME flaky tests

Developers can sift through many lines of code to manually look for the root cause of a flake, or re-run tests several times and document contradictory behavior. However, both methods cost time and money.

Instead, use a flaky-test detection tool such as FlakeGuard to quickly identify, flag, and potentially fix the flake before losing valuable debugging time.

FlakeGuard is a new open-source product launched on August 15, 2024. To learn more about its benefits and quick installation, see below or read our Medium article.

View FlakeGuard Results and Dashboard:

Install FlakeGuard:

npm i flake-guard

Run FlakeGuard:

npx flake-guard <filename>

Open Results in FlakeGuard Dashboard:
Once tests have successfully run, simply press “enter” on your keyboard to be redirected to FlakeGuard’s webpage.

Works Cited

W. Lam, P. Godefroid, S. Nath, A. Santhiar, and S. Thummalapenta. 2019. Root causing flaky tests in a large-scale industrial setting. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA’19). 204–215.

S. Malmir and P.C. Rigby. 2024. Predicting the Lifetime of Flaky Tests on Chrome. In Proceedings of the 1st International Workshop on Flaky Tests (FTW ’24). Association for Computing Machinery, New York, NY, USA, 5–13. https://doi.org/10.1145/3643656.3643899