Preventing erroneous Rails migrations

Dave Allie
Published in Tanda Product Team
Jun 22, 2019

At Tanda, we’ve never really had a smooth ride combining migrations with a medium-sized development team and shared development resources (such as a common database for staging data and development).

Alex wrote about some of our woes with structure.sql, which you can check out here. Until recently, one of the bigger risks of our development workflow was the possibility of migration code that had never been run making it into production. I’m going to dive into how that can happen, and what we did to prevent it.

After they’re created, database migrations go through a few different checkpoints at Tanda before making it to production. These are the three big ones:

  • Getting run on the developer’s machine
  • Passing code review
  • Getting run on a staging site

Between each of these steps, the migration might need to be adjusted by the developer. Depending on when the migration is changed, it’s possible for those changes to reach production having passed only the review step. Here’s how.

Breaking a Migration

I have my great new migration here:

class MyMigration < ActiveRecord::Migration[5.2]
  def change
    MyModel.where(my_field: 1).update_all(my_field: 2)
  end
end

Fantastic. It looks like it’s going to work, it runs fine locally, and my structure.sql or schema.rb is updated to include my new migration version. I commit the migration and the changes to the structure file, the tests run and pass, and it is deployed onto a staging server.

However, when it’s reviewed, my reviewer suggests I use MyModel.where(my_field: 1).batches.update_all(my_field: 2) because the table is big and the update should be broken into many smaller updates. Sounds good. I update the migration file locally and commit again.

class MyMigration < ActiveRecord::Migration[5.2]
  def change
    MyModel.where(my_field: 1).batches.update_all(my_field: 2)
  end
end

There was no change to my structure file, so nothing to commit there. Again, tests pass, it’s deployed, but this time, the migration isn’t run during my staging server deployment because it has already run. My reviewer comes back, approves my changes, and I merge into master where the production deploy kicks off.

If you haven’t seen the issue with the migration yet, I don’t blame you. Ruby, however, isn’t very forgiving. When the migration is run in production, it crashes, complaining that batches isn’t a method. And it’s right: the correct method is in_batches. So what went wrong, and how could this have happened?

Primarily, it was a process issue. The developer should have re-run the changed migration locally and the reviewer should have been more sure about the changes they were suggesting. If there’s one thing development tools are good for, it’s removing process problems.

Fixing the Issue

All the structure file does is let us know that the migration has run at some point. What we needed was a way to signify that the latest changes to a migration had been run, and the way we did that was with a concept called a ‘migration signature’.

We needed two things to make this work: a place to put the signature, and a way to generate it from the migration source. With those two, a rough workflow looks like this:

  1. When the ‘up’ part of the migration is run locally, we analyse the migration file and generate a signature based on the contents
  2. We save that migration signature somewhere
  3. In CI, we analyse the migration files again and compare them to their signatures
  4. If the expected and actual signatures don’t match, we know that a migration has been changed and has not been re-run

Storage was the easiest part of the puzzle: we would store the migration signature in the migration file itself, as a comment. In the case of our original migration above, after running it, this is what the file would look like:

# migration_signature: xxx
class MyMigration < ActiveRecord::Migration[5.2]
  def change
    MyModel.where(my_field: 1).update_all(my_field: 2)
  end
end

This way, migration signatures are right next to their migration, making checking and debugging a lot easier.
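As a minimal sketch of that storage step, the signature comment can be written as the first line of the migration source, replacing any signature that is already there. The `stamp` helper and `SIGNATURE_LINE` constant are hypothetical names used for illustration, and the exact comment placement is an assumption based on the example above.

```ruby
# Matches an existing signature comment on the first line of a file.
SIGNATURE_LINE = /\A# migration_signature: \S+\n/

# Returns `source` with the signature comment as its first line.
# An existing signature comment is replaced rather than stacked.
def stamp(source, signature)
  "# migration_signature: #{signature}\n" + source.sub(SIGNATURE_LINE, "")
end

migration = "class MyMigration < ActiveRecord::Migration[5.2]\nend\n"
puts stamp(migration, "xxx")
```

Replacing the old line instead of prepending a new one means that re-running a migration leaves exactly one signature in the file.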

Generating the migration signature was a little less straightforward. The first idea was to use a hash of the migration file. However, because we store the signature in the same file we’re signing, signing the same file twice would give two different signatures. So instead, the AST of the file is extracted and hashed to get the signature. This has the added benefit that any change that doesn’t affect the Ruby code itself will not change the signature: signing an already-signed file, or adding or removing comments and whitespace, leaves the signature the same.
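To illustrate the idea, here is a rough sketch of AST-based signing using Ruby’s standard library. The choice of Ripper as the parser and SHA-256 as the hash is an assumption made for this example, not necessarily what our gem does, and `strip_positions` and `migration_signature` are hypothetical names. Ripper attaches line and column pairs to tokens, so those are removed before hashing.

```ruby
require "ripper"
require "digest"

# Recursively drop the [line, column] pairs Ripper attaches to tokens,
# so that moving code around the file does not change the tree.
def strip_positions(node)
  return node unless node.is_a?(Array)

  node
    .reject { |child| child.is_a?(Array) && child.size == 2 && child.all?(Integer) }
    .map { |child| strip_positions(child) }
end

# Hash the position-free syntax tree of the given Ruby source.
def migration_signature(source)
  tree = Ripper.sexp(source) # nil when the source does not parse
  raise ArgumentError, "source is not valid Ruby" if tree.nil?

  Digest::SHA256.hexdigest(strip_positions(tree).inspect)
end

original = <<~RUBY
  class MyMigration < ActiveRecord::Migration[5.2]
    def change
      MyModel.where(my_field: 1).update_all(my_field: 2)
    end
  end
RUBY

signed = "# migration_signature: xxx\n" + original
edited = original.sub("update_all", "batches.update_all")

# Comments are not part of the tree, so signing does not alter the signature.
puts migration_signature(original) == migration_signature(signed)
# A change to the code itself produces a different tree.
puts migration_signature(original) == migration_signature(edited)
```

Because Ripper only parses the source, the sketch does not need Rails or the models to be loaded.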

Going back to the original example, when the migration is changed and not run again, the signature no longer matches the file source.

# migration_signature: xxx
class MyMigration < ActiveRecord::Migration[5.2]
  def change
    MyModel.where(my_field: 1).batches.update_all(my_field: 2)
  end
end

So in CI, when checking migrations for signatures, we can flag this migration and mark the build as failed with an error message:

Missing or invalid migration signature in migration: 20190621040327_my_migration. 
Please re-run your migration to receive an updated signature.
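The CI check itself can be sketched roughly as follows. `unsigned_migrations` and `failure_messages` are hypothetical helpers, the second and third migration names are made up, and the `signer` lambda is a deliberately simple stand-in (a hash of the non-comment lines) rather than the AST-based signature described above.

```ruby
require "digest"

# Captures the stored signature from the signature comment, if present.
SIGNATURE_COMMENT = /^# migration_signature: (\S+)$/

# Given { migration_name => source }, returns the names whose stored
# signature is missing or differs from what the block computes now.
def unsigned_migrations(sources)
  sources.reject { |_name, source| source[SIGNATURE_COMMENT, 1] == yield(source) }.keys
end

# Builds one error message per flagged migration.
def failure_messages(names)
  names.map do |name|
    "Missing or invalid migration signature in migration: #{name}.\n" \
      "Please re-run your migration to receive an updated signature."
  end
end

# Stand-in signer for illustration only: hashes every non-comment line.
signer = ->(source) { Digest::SHA256.hexdigest(source.lines.grep_v(/^\s*#/).join) }

body = "class MyMigration < ActiveRecord::Migration[5.2]\nend\n"
sources = {
  "20190621040327_my_migration" => "# migration_signature: #{signer.call(body)}\n#{body}",
  "20190622000000_edited_migration" => "# migration_signature: stale\n#{body}",
  "20190623000000_unsigned_migration" => body
}

failed = unsigned_migrations(sources, &signer)
puts failure_messages(failed)
```

A real CI script would read the sources from the migrations directory and exit with a non-zero status when the list of flagged migrations is not empty.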

That’s all there is to it. With a small utility, we have protected against a process problem, and encouraged good development practices at the same time.

We open sourced this tool and made it into a gem for anyone to use. You can find it here:

If you have come across this issue before in your development team and navigated around it in a different way, I’d love to hear about it. Drop a comment down below.
