Rails: find_each vs. find_in_batches vs. in_batches

涓 / Lynn Chang
Published in Lynn’s dev blog
May 5, 2021

We often call klass_name.all to fetch records from the database.

That is fine when the table holds only 100 records or so.

However, klass_name.all is usually not the best choice when we need to process a large number of records.

For example, with 1 million records, the query below makes ActiveRecord instantiate every object at once. Memory consumption climbs quickly, and in the worst case the application runs out of memory and cannot do any further work.

Project.all.map { |p| p.do_something_great }
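
To make the memory argument concrete, here is a plain-Ruby analogy that does not touch ActiveRecord at all. Integers stand in for model objects, and both helper names (peak_when_loading_all, peak_when_batching) are made up for this illustration; the sketch only counts how many items are held at once.

```ruby
# Plain-Ruby analogy, no ActiveRecord: integers stand in for model objects.
# Both helper names are hypothetical and exist only for this illustration.

# Loading everything at once: every item lives in one big array.
def peak_when_loading_all(total)
  rows = (1..total).to_a
  rows.size
end

# Walking in batches: only one slice is materialized at a time.
def peak_when_batching(total, batch_size)
  peak = 0
  (1..total).each_slice(batch_size) do |batch|
    peak = [peak, batch.size].max
  end
  peak
end

puts peak_when_loading_all(1_000_000)      # items held at once without batching
puts peak_when_batching(1_000_000, 1_000)  # items held at once with batching
```

The batched walk never holds more than one batch at a time, no matter how large the total is.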

Rails provides three public methods, find_each, find_in_batches, and in_batches, that process records in batches and so help reduce memory consumption.

What’s the difference between the three of them? Let’s see!

find_in_batches

If we do not specify a batch size, the default is 1,000.

For example, with 3,000 records, the first 1,000 (ordered by primary key) form the first batch, the next 1,000 form the second batch, and so on.
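
A rough plain-Ruby sketch of where the batch boundaries come from. Rails orders by primary key and starts each batch after the last id of the previous one, so when ids have gaps a batch is "the next 1,000 rows" rather than a fixed id range. The batches_for helper and the row hashes are hypothetical stand-ins, not Rails API.

```ruby
# Plain-Ruby sketch of primary-key batching (no ActiveRecord involved).
# `rows` is a hypothetical in-memory stand-in for a table: [{ id: 1 }, ...].
def batches_for(rows, batch_size = 1000)
  sorted = rows.sort_by { |row| row[:id] }
  batches = []
  last_id = 0

  loop do
    # Roughly: WHERE id > last_id ORDER BY id LIMIT batch_size
    batch = sorted.select { |row| row[:id] > last_id }.first(batch_size)
    break if batch.empty?

    batches << batch
    last_id = batch.last[:id]
  end

  batches
end

# 3,000 rows whose ids have gaps (2, 4, 6, ...).
gappy_rows = (1..3000).map { |n| { id: n * 2 } }
batches_for(gappy_rows).each_with_index do |batch, index|
  puts "batch #{index + 1}: ids #{batch.first[:id]}..#{batch.last[:id]} (#{batch.size} rows)"
end
```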

If no block is given to find_in_batches, it returns an Enumerator whose elements are arrays of records:

Project.find_in_batches.class
#=> Enumerator < Object
Project.find_in_batches.first.class
#=> Array < Object

If a block is given, each batch is yielded to it as an Array of records (projects below), and we iterate over that array ourselves:

Project.where(status: 'success').find_in_batches do |projects|
  projects.each { |project| project.do_something_great! }
end

find_each

As with find_in_batches, the default batch size is 1,000.

If no block is given to find_each, it also returns an Enumerator, but its elements are individual records:

Project.find_each.class
#=> Enumerator < Object
Project.find_each.first.class
#=> Project < ApplicationRecord

If a block is given, find_each calls find_in_batches internally and yields the records of each batch one at a time:

# File activerecord/lib/active_record/relation/batches.rb, line 68
def find_each(start: nil, finish: nil, batch_size: 1000, error_on_ignore: nil, order: :asc)
  if block_given?
    find_in_batches(start: start, finish: finish, batch_size: batch_size, error_on_ignore: error_on_ignore, order: order) do |records|
      records.each { |record| yield record }
    end
  else
    #....
  end
end

As the source code shows, the two queries below produce the same result.

So when we want to iterate over records in batches, find_each is a shortcut: it does the inner each for us.

Project.where(status: 'success').find_in_batches do |projects|
  projects.each { |project| project.do_something_great! }
end

Project.where(status: 'success').find_each do |project|
  project.do_something_great!
end
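
The delegation can be imitated in plain Ruby to see why the two forms are equivalent. ToyRelation is a hypothetical stand-in that wraps an array; it is not ActiveRecord and skips the database entirely. It only mirrors the shape of the source quoted above: find_each is layered on find_in_batches, and both return an Enumerator when no block is given.

```ruby
# Toy stand-in for a relation (hypothetical; not ActiveRecord).
class ToyRelation
  def initialize(records)
    @records = records
  end

  # Yields arrays of up to batch_size records.
  def find_in_batches(batch_size: 1000, &block)
    return enum_for(:find_in_batches, batch_size: batch_size) unless block

    @records.each_slice(batch_size, &block)
    nil
  end

  # Yields records one at a time by unwrapping each batch.
  def find_each(batch_size: 1000)
    return enum_for(:find_each, batch_size: batch_size) unless block_given?

    find_in_batches(batch_size: batch_size) do |records|
      records.each { |record| yield record }
    end
  end
end

relation = ToyRelation.new((1..2500).to_a)
p relation.find_in_batches.map(&:size) # => [1000, 1000, 500]
p relation.find_each.first(3)          # => [1, 2, 3]
```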

in_batches

The default batch size is 1,000, too.

If no block is given to in_batches, it returns a BatchEnumerator rather than a plain Enumerator as find_each and find_in_batches do. Each element it yields is an ActiveRecord::Relation covering a whole batch, not a single record or an array of records:

Project.in_batches.class
#=> ActiveRecord::Batches::BatchEnumerator < Object
Project.in_batches.first.class
#=> Project::ActiveRecord_Relation < ActiveRecord::Relation

If a block is given, in_batches yields each batch as an ActiveRecord::Relation, so relation methods such as update_all can run on the whole batch:

Project.where(status: 'success').in_batches do |projects|
  projects.update_all(status: 'draft')
end
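
As a loose analogy for why yielding a relation is handy, the plain-Ruby toy below treats a batch as an object that can update all of its rows in one step, instead of handing back individual records. ToyBatch, toy_in_batches, the Hash-based table, and the log are all hypothetical; only the of: keyword is borrowed from the real in_batches option name.

```ruby
# Toy model, not ActiveRecord: the "table" is a Hash of id => attributes.
# ToyBatch and toy_in_batches are hypothetical names for this sketch only.
class ToyBatch
  attr_reader :ids

  def initialize(table, ids, log)
    @table = table
    @ids = ids
    @log = log
  end

  # Imitates a bulk update: one logged "statement" for the whole batch.
  def update_all(attrs)
    @log << "UPDATE (#{ids.size} rows)"
    ids.each { |id| @table[id].merge!(attrs) }
    ids.size
  end
end

# Yields one ToyBatch per slice of ids, ordered by id.
def toy_in_batches(table, log, of: 1000)
  table.keys.sort.each_slice(of) do |ids|
    yield ToyBatch.new(table, ids, log)
  end
end

table = (1..2500).to_h { |id| [id, { status: 'success' }] }
log = []
toy_in_batches(table, log) { |batch| batch.update_all(status: 'draft') }
puts log
```

In this toy, 2,500 rows are changed with three logged statements, one per batch.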
