Hunting flaky tests 4: Incomplete Teardown
This installment of our case studies on flaky tests encountered at Doctolib covers problems related to teardown. Below we explain why incomplete teardown is troublesome, give examples of such issues, and describe possible solutions.
In a perfect world, each test would be run in its own isolated environment. In reality, however, the cost of mounting and starting an entirely new environment forces us to use the same one for the whole test suite. Between each test we run some setup/teardown in an attempt to simulate a pristine environment for each.
Just like in public bathrooms, each test should leave the environment (a) just as it found it. If even one teardown does not undo its corresponding setup, the rest of the test suite will run in a different environment (b) from the one that existed before that test ran.
Since tests run in a random order, any given test may run in either context (a or b) from one run to the next, possibly producing different results, and thus flakiness.
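The effect of test order can be reproduced without any browser. The following is a minimal sketch in plain Ruby, not code from our suite: `ENVIRONMENT`, `test_dashboard`, `test_login_page` and `run_suite` are hypothetical names, and the hash stands in for whatever state the suite shares (database, cookies, browser session).

```ruby
# Shared environment reused by every test (stand-in for a database or browser session).
ENVIRONMENT = { logged_in: false }

# Hypothetical test whose setup logs a user in but whose teardown never logs out.
def test_dashboard
  ENVIRONMENT[:logged_in] = true # setup
  :pass
  # Missing teardown: ENVIRONMENT[:logged_in] = false
end

# Hypothetical test that assumes a pristine, logged-out environment.
def test_login_page
  ENVIRONMENT[:logged_in] ? :fail : :pass
end

# Runs the tests in the given order, starting from environment (a),
# and returns a hash of test name => result.
def run_suite(order)
  ENVIRONMENT[:logged_in] = false
  order.to_h { |name| [name, send(name)] }
end

p run_suite(%i[test_login_page test_dashboard])
p run_suite(%i[test_dashboard test_login_page])
```

In the first order both tests pass. In the second, `test_login_page` runs in environment (b) and fails, even though nothing in that test changed.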
Examples
Multiple windows
Edit (2018-03-21): This problem has been addressed by the Capybara team in this pull request. You can now safely use Capybara.reset!, even with multiple windows.
At Doctolib we use a Chrome extension that establishes communication between multiple tabs. Since we want to cover its behavior with E2E tests, we need a Capybara test that opens multiple tabs. This is accomplished using Capybara's open_new_window function.
In a single-window test, you simply call Capybara.reset! during teardown. This method is responsible for stopping pending Ajax calls. To do that, it visits the page ‘about:blank’, a standard blank page that executes no JavaScript, which cancels any pending Ajax calls. Once this is done, you can safely clear your database.
Unfortunately, Capybara.reset! only works for the current window, so any other open windows will continue to execute their JavaScript. During the rest of the teardown, these scripts can trigger Ajax calls to your backend. Since you also clean the database during teardown, these calls are likely to fail, resulting in a flaky failure of the entire test.
Safari webdriver does not clean “httponly” cookies
Some tests require a user to be authenticated, which is usually done during test setup. During teardown, we rely on Capybara.reset! to clear cookies and log this user out. If the cookies have not been properly cleared, the following test can start with an already logged-in user. If that test also needs to log in, it will fail.
This is precisely the case with Safari: our authentication token is stored as an “httponly” cookie, and the Safari webdriver does not seem to delete this type of cookie. So, on Safari, after a test that includes authentication, the next test to run will still have an authenticated user and will fail as described above.
Solutions
Multiple windows
We ensure Ajax calls are finished on each open tab before clearing the database. We iterate over each window to reset it (with Capybara.reset!) and close it:
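teardown do
  # Visit every open window, in reverse of the order Capybara reports them.
  Capybara.windows.reverse.each do |window|
    Capybara.switch_to_window(window)
    # Reset the now-current window so its pending Ajax calls are cancelled.
    Capybara.reset!
    # Keep the first window open; close every other one.
    window.close unless window == Capybara.windows.first
  end
end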
Note that we do not close the first window. Capybara opens it only at the start of a test batch, so closing it causes the following tests to fail.
Safari webdriver does not clean “httponly” cookies
As a quick fix, during teardown, if we are on Safari, we visit a page that clears this cookie from the backend side.
The problem is discussed here and here. It seems that this issue has been resolved in the new official Safari webdriver.
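The quick fix described above could look roughly like the following sketch. It is an illustration under assumptions, not our actual implementation: the cookie name `auth_token`, the path `/test/clear_auth_cookie`, the constant `CLEAR_AUTH_COOKIE` and the helper `clear_auth_cookie` are all hypothetical. The endpoint is written as a bare Rack-style callable (it takes `env` and returns status, headers and body), and the helper only assumes that `session` responds to `visit`, as a Capybara session does.

```ruby
# Hypothetical cookie name; the real one is not given in this article.
AUTH_COOKIE = 'auth_token'

# Rack-style endpoint: a callable returning [status, headers, body].
# It expires the cookie from the backend side, so it does not depend on
# the driver being able to delete "httponly" cookies itself.
CLEAR_AUTH_COOKIE = lambda do |_env|
  expired = "#{AUTH_COOKIE}=; path=/; expires=Thu, 01 Jan 1970 00:00:00 GMT; HttpOnly"
  [200, { 'content-type' => 'text/plain', 'set-cookie' => expired }, ['cleared']]
end

# Teardown helper. `session` is anything that responds to #visit;
# `safari` is a flag the caller derives from its own driver configuration.
# The default path is an assumption and must be routed to the endpoint above.
def clear_auth_cookie(session, safari:, path: '/test/clear_auth_cookie')
  session.visit(path) if safari
end
```

The cookie is only removed if the `path` attribute in the response matches the one used when the cookie was set, so `path=/` here is also an assumption. Such an endpoint should only be mounted in the test environment.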