Eliminating Nondeterministic (“Flaky”) Tests in Ruby and RSpec
At Panorama, we strongly believe in automated tests. No code gets deployed without passing both a thorough code review and a battery of thousands of automated tests, and no code makes it through code review without updating tests for the new features it’s adding. Automated tests help us build new features and refactor old code without introducing bugs, and are a big part of the reason we are able to confidently deploy new changes to our production apps multiple times a day.
What are flaky tests?
But with an automated test suite comes the dreaded possibility of tests that fail nondeterministically: that is, most of the time they pass, but every once in a while they fail for no obvious reason. When the tests are retried, they pass again. Flaky tests.
Flaky tests reduce developer confidence in the test suite, waste our time when we need to have tests retry until they succeed, and can delay the release of changes or even critical bugfixes. As a result, we take a hard stance and make sure to squash flaky tests whenever we see them.
Common causes of test flakiness
Over time, we’ve found that in our codebase flaky tests tend to have one of three causes:
Cause #1: Tests share state
In RSpec this typically means we're creating something in our database in a `before(:all)` block rather than a `before(:each)` or `let` block (since we use RSpec's transactional fixtures, any database insertions or updates that happen in a `before(:each)`/`let` are reverted after the test executes).
To track down these instances, we've eliminated `before(:all)` from our test suite in all but a few special instances. In addition, we've added a special `after(:all)` block that can run after each test file and check whether anything has been left in the database:
```ruby
# A simplified version of our after(:all) check.
RSpec.configure do |config|
  if ENV["DEBUG_TEST_CLEANUP"]
    config.after(:all) do |example|
      ApplicationRecord.descendants.each do |klass|
        if klass.unscoped.exists?
          puts "#{klass} was not cleaned up in "\
               "#{example.class.description}!"
          klass.unscoped.delete_all
        end
      end
    end
  end
end
```
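To make the failure mode concrete, here's a plain-Ruby sketch of how state shared across tests becomes order-dependent (no RSpec or database involved; the array simply stands in for a table):

```ruby
# STUDENTS stands in for a database table that is never rolled back.
STUDENTS = []

# A before(:all)-style setup: runs once per "file" and leaks its rows.
def leaky_setup
  STUDENTS << "Student A"
end

# A test that implicitly assumes the table starts out empty.
def count_test_passes?
  STUDENTS.size == 1
end

leaky_setup
puts count_test_passes?  # true when this file runs first

leaky_setup              # another file's setup ran earlier in the suite
puts count_test_passes?  # false: leftover rows break the same test
```

Whether the test passes depends entirely on what ran before it, which is exactly the behavior we see when a `before(:all)` block leaks rows between randomly ordered test files.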
Cause #2: Tests sort auto-incrementing names
We use the `fabrication` gem to easily build objects and save them to the database, and `fabrication` provides a sequence feature that lets you auto-increment fields. For instance:
```ruby
Fabricator(:student) do
  name { sequence(:name) { |i| "Student #{i}" } }
end
```

```ruby
# Create three students with names "Student 0", "Student 1", and
# "Student 2"
let!(:students) { Fabricate.times(3, :student) }

# Test that our results are in reverse-sorted order by student name.
it { is_expected.to eq students.reverse }
```
But since our tests run in a random order and these sequences are global, the above code could generate students with names `"Student 73"`, `"Student 74"`, and `"Student 75"`, or any other sequence of integers, depending on how many previous tests also called `Fabricate(:student)`.
Since these strings are then used for sorting, we'll run into problems when crossing number-of-digit boundaries, like with `"Student 99"`, `"Student 100"`, and `"Student 101"`. In that case, the reverse alphabetical sort would be `"Student 99"`, `"Student 101"`, and `"Student 100"`, causing our test to fail.
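You can reproduce that failure in plain Ruby, with no fabrication involved, because `String#<=>` compares character by character:

```ruby
names = ["Student 99", "Student 100", "Student 101"]

# "Student 100" sorts before "Student 99" because "1" < "9".
p names.sort.reverse
# => ["Student 99", "Student 101", "Student 100"]
```

The reverse-sorted array is not the reverse of the creation order, so a test comparing against `students.reverse` fails whenever the global sequence happens to cross a digit boundary.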
While we could track down the places where we're relying on this sort of automatic naming and sorting and change those tests, we found it much easier to globally start `fabrication` sequences at very high values to avoid this problem:
```ruby
Fabrication.configure do |config|
  # We change all Fabricator sequences to start at this very high
  # number to help us avoid nondeterministic test failures where we
  # expect two things we're fabricating in a given order to have
  # that ordering when sorting by their names (e.g. "Item 3" < "Item
  # 4"), but digit boundaries can lead to unexpected behavior
  # (e.g. "Item 10" < "Item 9"). Starting at this high number means
  # all sequences will have 89,999,999 iterations before they
  # encounter this digit boundary case, virtually eliminating the
  # probability of nondeterministic test failures caused by
  # unexpected sequence name orderings.
  config.sequence_start = 10000000
end
```
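A quick plain-Ruby sanity check of that arithmetic: with an eight-digit starting point, consecutive generated names all have the same number of digits, so string order and numeric order agree:

```ruby
start = 10_000_000  # same value as config.sequence_start above
names = (start...start + 5).map { |i| "Student #{i}" }
p names.sort == names  # true: no digit boundary is crossed
```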
Cause #3: Tests manually set the `id` (the primary key) of database rows
Doing something like this:
```ruby
let!(:school1) { Fabricate(:school) }
let!(:school2) { Fabricate(:school, id: 42) }
let!(:school2_student) { Fabricate(:student, school_id: 42) }
```
might seem innocent, but when the monotonically increasing `id` that the database generates for `school1` happens to be `42`, the creation of `school2` will raise an error because two database records can't have the same primary key.
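Here's a toy model of the collision, with a hash standing in for a table and a counter standing in for the database's auto-increment sequence (an illustration only, not real database behavior):

```ruby
# A toy auto-incrementing "table": explicit ids bypass the counter,
# which is how a later insert can collide with a pinned id.
class ToyTable
  def initialize
    @rows = {}
    @next_id = 1
  end

  # Insert a row, using the auto-increment counter unless an explicit
  # id is given; raise on a primary-key collision.
  def insert(id: nil)
    if id.nil?
      id = @next_id
      @next_id += 1
    end
    raise "duplicate primary key #{id}" if @rows.key?(id)
    @rows[id] = id
    id
  end
end

table = ToyTable.new
41.times { table.insert }  # earlier tests happen to consume ids 1..41
table.insert               # like school1: the database hands out 42
begin
  table.insert(id: 42)     # like school2: pinned to 42 -> collision
rescue RuntimeError => e
  puts e.message           # "duplicate primary key 42"
end
```

Whether the collision happens depends on how many rows earlier tests inserted, which is why the failure only shows up on some random test orderings.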
This error can be hidden by more subtle code, like:
```ruby
context "when a student and school have the same database id" do
  let!(:school) { Fabricate(:school) }
  let!(:other_student) { Fabricate(:student) }
  let!(:same_id_student) { Fabricate(:student, id: school.id) }

  it { is_expected.to be true }
end
```
The good news is that this issue is easy to spot as a red flag in code reviews: your database should set primary key `id` fields, not your application code.
So if you’ve got a test that sometimes passes and sometimes fails, try checking these three things:
- Does the test (or any others) use `before(:all)`? If so, change that to `before(:each)` or `let` blocks instead.
- Does the test check the sorted order of strings that contain digits? If so, change the test, or better yet, configure `fabrication` or a similar tool to start these sequences at much higher values.
- Does the test set the `id` of any model? If so, rework the test so it does not.
If none of those issues are the problem, we'd love to hear about it! And if you're interested in working on a codebase that takes tests seriously, we're hiring!