Allow re-running failed tests easily in WebdriverIO

Michael Salvia
JW Player Engineering
Aug 18, 2020

At JW Player (GitHub) we run 10,000+ tests a day for our namesake product, which reaches 150–200 million unique users per day. Given the scale at which we operate and the variable nature of the web across desktop and mobile devices, we need to make sure our tests are robust and capture actionable data from our nightly and release test reports.

Our Test Engineering team uses WebdriverIO (GitHub) with Cucumber for all playback products. Though most of our tests pass reliably, we occasionally have blips. These blips can produce false positives: failures unrelated to our product. This requires test engineering to spend additional time and effort, often simply to say, “not an actual issue.” This doesn’t just waste a test engineer’s time; it can also delay PRs from being merged and new versions from being released.

As an example, our engineering teams rely heavily on integration test reporting for each pull request (PR), and these tests are used to give the “go/no-go” on our releases. Re-running failing tests lets us determine which issues need to be addressed, and which are the result of a blip in the system performing the test.

So if you use WebdriverIO and have ever:

  • Had a PR blocked by a flaky integration test that was impossible to reproduce locally
  • Come in the morning to a worrisome nightly report filled with false positives due to network connectivity issues
  • Written integration tests for software that relies on multiple external factors, including network connectivity, RAM usage, CPU speed, or unmanaged, rate-limited network resources

Then today, we are happy to announce the release of wdio-rerun-service (GitHub) to the WebdriverIO community.

This service tracks failing tests and scenarios, allowing unstable tests or scenarios to be re-run.
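
To give a sense of how it plugs in, here is a minimal sketch of enabling the service in a WebdriverIO config. The option names and paths shown are illustrative assumptions, not guaranteed to match the package exactly; the README covers the actual installation and configuration.

// wdio.conf.js (illustrative sketch; install the package first via npm, per the README)
const RerunService = require('wdio-rerun-service');

exports.config = {
    // ...the rest of your WebdriverIO configuration...
    services: [
        [RerunService, {
            rerunDataDir: './results/rerun',  // where per-worker re-run JSON files are written (assumed option)
            rerunScriptPath: './rerun.sh'     // the re-run script generated after all workers finish (assumed option)
        }]
    ]
};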

Of course, the best practice is to write reliable, predictable tests that never break. Sadly, depending on the software under test, this is often not the case.

Service lifecycle

The diagram above walks through the service’s lifecycle. As a worker runs, it keeps track of any instabilities found in the afterScenario hook; these are logged in real time to a JSON file for that specific worker.

Example:

[
  {
    "location": "basic/basic.feature:4",
    "failure": "Error: Session not started or terminated"
  }
]
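
Conceptually, the per-worker capture works like the sketch below. This is not the service’s actual implementation (hook signatures vary by WebdriverIO version); it simply illustrates appending each failure to a worker-specific JSON file, using hypothetical paths.

// Illustrative sketch only: capturing failures per worker, not the service's real code.
const fs = require('fs');

const rerunDataFile = `./results/rerun-${process.pid}.json`; // one file per worker process (hypothetical path)
const failures = [];

function recordFailure(location, error) {
    // Store the failing scenario's feature-file location and error message,
    // then flush to disk so the data survives even if the worker dies.
    failures.push({ location, failure: String(error) });
    fs.writeFileSync(rerunDataFile, JSON.stringify(failures, null, 2));
}

// Called from something like an afterScenario hook (signature simplified):
// if (!result.passed) recordFailure('basic/basic.feature:4', result.error);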

After all the workers have finished and the onComplete hook is executed, the service looks for any available re-run data files and constructs a Bash script (default: rerun.sh). This script is generated using the wdio command from the primary execution, with all unstable spec files added as arguments to that command.

Example:

DISABLE_RERUN=true wdio ./wdio-configs/chrome.conf.js --spec=basic/basic.feature:4

The environment variable DISABLE_RERUN is set so the service’s processing is disabled during the re-run itself.
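
For illustration, the assembly step could look roughly like the following, assuming the per-worker JSON files all live in one directory. This is a sketch under those assumptions, not the service’s code; function and path names are hypothetical.

// Illustrative sketch only: building rerun.sh in onComplete from per-worker JSON files.
const fs = require('fs');
const path = require('path');

function buildRerunScript(rerunDataDir, confPath, scriptPath = './rerun.sh') {
    // Gather every failing location logged by the workers.
    const specs = fs.readdirSync(rerunDataDir)
        .filter((file) => file.endsWith('.json'))
        .flatMap((file) => JSON.parse(fs.readFileSync(path.join(rerunDataDir, file), 'utf8')))
        .map((entry) => `--spec=${entry.location}`);

    if (specs.length === 0) return; // nothing unstable, no script needed

    // DISABLE_RERUN=true keeps the service from tracking failures again during the re-run.
    const command = `DISABLE_RERUN=true wdio ${confPath} ${specs.join(' ')}`;
    fs.writeFileSync(scriptPath, `#!/bin/bash\n${command}\n`, { mode: 0o755 });
}

// Example usage (hypothetical paths):
// buildRerunScript('./results/rerun', './wdio-configs/chrome.conf.js');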

The service was crucial in stabilizing test reporting for JW Player’s SDK offerings. Originally, one test engineer per platform (iOS and Android) would spend 1–2 hours a day working through reports, determining whether something was a new bug or a false positive caused by any number of external factors, and that was for a single platform version. Today, a test engineer is able to work through test reports for 6 versions of Android OS or 4 versions of iOS in less than an hour.

While it would be great if all tests passed the first time, and while a re-run has the potential to hide sporadic issues or flakiness, in the modern world of “connected everything” there are simply too many external factors. When working with 10,000+ tests, spanning 2 products, on 20+ platforms, having a re-run has allowed JW Player to focus on what is failing due to potential product issues and not any number of external factors.

Ready to get started? Check out the README.md for installation and configuration details.
