What’s the problem?
Recently, I was in charge of some heavy migrations in my company and I had to find a solution to make it smooth.
We work with Capistrano, and I couldn’t lock the deployment process for too long, which was the biggest problem. Some of those migrations were structural, and some were data related, updating hundreds of thousands to millions of rows in the database.
I ended up writing a migration flow which was a mix of raw SQL, database schema changes through ActiveRecord, and asynchronous workers launched either from the migration itself or some rake tasks. Everything was in a specific order starting from the heavy changes, to the structural, and ending with several consistency check-ups.
It worked, and it’s what I usually do, but writing this flow and making it all work together was a real pain in the ass. I know I cannot avoid this kind of process and that it’s part of the scaling phase of a company, but burning this time to go back to raw SQL queries, writing different types of workers and rake tasks, was something I felt could be partially improved.
I decided to look at what solutions are out there, and it seems lots of companies just build their own mechanisms depending on their structure. Small projects are too small to worry about this, bigger companies just write their own system.
But what about the guys in-between? There aren’t many solutions beyond what I was already doing.
So I decided to write a gem.
What’s my solution?
I wrote RailsAsyncMigrations, which is a very simple extension of ActiveRecord::Migration.
It does not require heavy work on your end, since it goes through ActiveRecord, and can be customised pretty easily too.
It’s not for all migrations, but when you need to slowly move data, sanitize or change the database structure without needing the result right after your deployment, this helps quite a lot to ease the coding process.
It works in 5 minutes… I swear!
For this example, I’ll use Delayed::Job as the reference, but this can be changed to Sidekiq in literally one line if you configure it in your project.
Make a new project
Open up your console and go in your projects directory, start by typing
rails new my_project
Then, add Delayed::Job to your project Gemfile and its only dependency
Go back to your console and type those lines to install it completely
rails generate delayed_job:active_record
Everything went well so far? You can now install RailsAsyncMigrations by adding it to your Gemfile as well
You also have to install it. We will add a simple table needed to keep the state of our future asynchronous migrations.
rails generate rails_async_migrations:install
Use it now !
That’s pretty much it, you’re good to go and can add migrations. Let’s see an example.
rails generate migration "this_is_a_test"
Check out the last file in db/migrate/ and change it this way
class ThisIsATest < ActiveRecord::Migration[5.2]
  turn_async

  def change
    create_table 'tests' do |t|
      # add your columns here
    end
  end
end
What happens then? The turn_async keyword will tell ActiveRecord::Migration to use our parallel migration queue instead of just running everything in the same process.
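To give an idea of how a class-level keyword like this can work, here is a minimal sketch in plain Ruby. It is not the gem’s real code: AsyncFlag, FakeMigration, run_migrations and the two migration classes are hypothetical names I made up for the illustration. It only shows the general pattern, where a flag set in the class body decides whether a migration runs right away or gets queued.

```ruby
# Hypothetical sketch of a class-level macro; NOT the gem's real internals.
module AsyncFlag
  # Calling `turn_async` in a class body marks that class as asynchronous.
  def turn_async
    @async = true
  end

  def async?
    @async == true
  end
end

# Stand-in for ActiveRecord::Migration, so the sketch runs without Rails.
class FakeMigration
  extend AsyncFlag
end

class CreateTests < FakeMigration
  turn_async

  def change
    :create_tests_table
  end
end

class AddIndex < FakeMigration
  def change
    :add_index
  end
end

# Toy runner: flagged migrations are pushed to a queue, the others run inline.
def run_migrations(migrations, queue)
  migrations.map do |klass|
    if klass.async?
      queue << klass
      :queued
    else
      klass.new.change
    end
  end
end

queue = []
results = run_migrations([CreateTests, AddIndex], queue)
puts results.inspect # the flagged class is queued, the other one ran inline
```

The flag lives on each class separately, so a migration without the keyword is simply left alone.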
Let’s run it and see.
$ rake db:migrate
But the migration was run?! Yes, with RailsAsyncMigrations we go through the classical migration run, but methods such as change or down will have their content ignored by the process. If you run rake db:rollback, their content will be taken away as well. So you can go in any order or direction and use your migration commands without worrying about the ones you turned asynchronous.
At this point, a row has been added to our table rails_async_migrations and is ready to be run. The starting state is created; it will go through multiple states.
The whole point of this library is to avoid building too much on your side for something which is a pretty simple concept. Under the hood, a lot is happening, but you don’t need to worry about that.
If Delayed::Job is already running on your machine, chances are the migration is already done, since it was pretty simple to process.
In any case, make sure the queue will be run by writing this in your console
$ bin/delayed_job start
It was that simple. All the migrations you turn_async will now be launched via Delayed::Job in this parallel pipe.
What should I use it for?
Be aware that turning your migrations asynchronous will add difficulties; ensuring data consistency when building your migrations is crucial.
This quick example used ActiveRecord::Migration functionalities, but it ignores the idempotency that worker systems require, which you will face when using it in bigger projects.
When using this parallel queue, you have to ensure your data alterations can be repeated multiple times, in case the worker is run again. Adding conditions to check whether something was already added, or being careful with your queries, does the trick.
I personally use it to alter data which can be slowly updated, without the risk of breaking it with multiple runs.
MyModel.where(something: true).find_each do |my_model|
  my_model.update! something: false
end
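To make the idempotency point concrete, here is a small sketch that runs without Rails. The rows are plain hashes standing in for database records, and backfill is a hypothetical helper, not something from the gem. It selects only the rows that still need the change, so running it a second time touches nothing.

```ruby
# In-memory stand-in for database rows; hypothetical data, no ActiveRecord.
rows = [
  { id: 1, something: true },
  { id: 2, something: false },
  { id: 3, something: true }
]

# Flips only the rows that still need it and returns how many it touched.
def backfill(rows)
  pending = rows.select { |row| row[:something] }
  pending.each { |row| row[:something] = false }
  pending.size
end

first_run  = backfill(rows) # touches the two rows still flagged
second_run = backfill(rows) # nothing left to do, so it is a no-op

puts "first run: #{first_run}, second run: #{second_run}"
```

The filter plays the same role as the where clause in the migration above: if the worker is retried, it only sees what is left.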
What if I’ve got multiple migrations?
It would be a total mess if all workers were launched at the same time, so there’s a queue, with a specific order of execution, the same way as the synchronous ones.
Once a migration is done, the queue moves on to the next one. If it’s failed, it locks the process and uses the natural retry system of your workers, then eventually moves on to the next one.
If there’s a failure, check the logs of your worker. You can fix the problem and let it try again, or change its state in the database and manually trigger a recheck of the queue.
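The behaviour described above can be sketched as a tiny sequential queue in plain Ruby. This is a toy model and not how the gem is implemented: Job and process_queue are hypothetical names, and I only use the created, done and failed states mentioned in this post. Jobs run in order, a failure stops the pass, and a later pass (the retry) picks up where it stopped.

```ruby
# Hypothetical model of a sequential migration queue; not the gem's code.
Job = Struct.new(:name, :work, :state)

# Runs jobs in order and stops at the first failure,
# so the jobs after it are left untouched until the retry.
def process_queue(jobs)
  jobs.each do |job|
    next if job.state == :done # already processed on a previous pass

    begin
      job.work.call
      job.state = :done
    rescue StandardError
      job.state = :failed
      break # lock the queue until this job succeeds
    end
  end
  jobs.map(&:state)
end

attempts = 0
flaky = lambda do
  attempts += 1
  raise "temporary error" if attempts == 1 # fails only the first time
end

jobs = [
  Job.new("first",  -> {}, :created),
  Job.new("second", flaky, :created),
  Job.new("third",  -> {}, :created)
]

puts process_queue(jobs).inspect # => [:done, :failed, :created]
puts process_queue(jobs).inspect # => [:done, :done, :done]
```

The third job never starts while the second one is failed, which is what keeps the order of execution intact.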
What if I mix synchronous and asynchronous migrations together?
Any migration you don’t turn asynchronous will behave the way it originally did. You can put turn_async on any migration you want and still keep some synchronous ones in-between; it’ll just move some to the new queue and keep running the others through the classical process.
What if I want to use Sidekiq?
Once again, just add a bit of configuration
RailsAsyncMigrations.config do |config|
  config.workers = :sidekiq
end
Extending the principle
The async_schema_migrations table is very straightforward. You can build a view to check the state of the different migrations, and force some of them to be removed or updated without running them.
If you want to go further with this, or want to ask me for an extra feature, don’t hesitate to contact me. This gem is fresh, and I hope people will like it!