ActiveRecord::Relation::Union pitfall that could cause a disaster

Akihiko Ito
Getsafe
Published in
2 min readAug 3, 2022

What happened?

I was refactoring a piece of code that was iterating over a big chunk of data. The operation looked like:

contracts = scope1 + scope2
contracts.each do |contract|
...
end

This is a somewhat typical, inefficient way of querying and processing large amount of data. With the expression scope1 + scope2, all the records are loaded into the memory at once.

So using active_record_union gem, I changed it to

contracts = scope1.union(scope2)
contracts.find_each do |contract|
...
end

This way, scope1.union(scope2) doesn’t actually run the query, and contracts.find_each queries records in batches with sensible size.

Now, scope2 is a method that returns either an instance of ActiveRecord::Relation or an empty array.

def scope2
return [] if ...
Contract.some_scope
end

This worked fine in the pre-refactor code — except for the inefficiency — because scope1 + scope2 would just be a concatenation of two arrays.

However, the post-refactor scope1.union(scope2) can return quite an unexpected result.

Try the following:

Contract.none.union([]).to_sql # replace "Contract" with any ActiveRecord model you have

and you’d actually get

"contracts".* FROM ( (SELECT "contracts".* FROM "contracts" WHERE (1=0)) UNION (SELECT "contracts".* FROM "contracts") ) "contracts"

If you look at the part after UNION, you’ll realise it’s just querying all contracts.

You can imagine how this could lead to a disastrous scenario, for instance, it could result in sending an invoice to ALL customers instead of those who’ve just purchased something; or hitting an external API for every existing contract instead of just a tiny subset of contracts and effectively DDoS-ing their system.

What happens under the hood?

When you do Contract.some_scope.union([]), active_record_union converts the empty array into Contract.where([]) (*), so the statement is equivalent toContract.some_scope.union(Contract.where([])).

Now, active_record treats where with a blank-ish argument as “no-op” (*) meaning Contract.where([]) would be equivalent toContract.all.

And voilà, Contract.some_scope.union([]) is equivalent toContract.some_scope.union(Contract.all) .

What should I do to avoid disaster?

Don’t use an empty array trying to emulate an ActiveRecord::Relation that’s guaranteed to return an empty result. UseActiveRecord::QueryMethods#none (e.g., Contract.none) instead. The method scope2 should now look like this:

def scope2
return Contract.none if ...
Contract.some_scope
end

Plus, writing tests that make sure things that aren’t supposed to happen aren’t happening, would help prevent inadvertently introducing a code that fetches and processes all records.

--

--