Race Conditions Solution for Rails Uniqueness Validations


It’s well known that Ruby on Rails uniqueness validations are prone to race conditions. I will use our own example to clarify what I mean:

We use Sidekiq for our asynchronous jobs. We have an endpoint, /api/parse/?data=, that receives lots of parsing requests from another server and enqueues an asynchronous job for each of them, so many parsing jobs are enqueued per second. Each of those jobs creates new records in the database, and some of those records may be duplicates. When we call save! in Rails, the uniqueness validation doesn’t fail, because internally save! runs the validation as a database SELECT before the INSERT, and the other parsing jobs are doing the same at the exact same time, so none of them has inserted its row at that point. You can see an example of this here. So if Rails validations can’t guarantee uniqueness, how do we keep our data consistent in the database?
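To make the interleaving concrete, here is a minimal plain-Ruby sketch of the check-then-insert sequence. It does not use Rails or Sidekiq: FakeTable and simulate_race are hypothetical names invented for this illustration, and the two jobs are interleaved by hand rather than run in real threads.

```ruby
# Plain-Ruby stand-in for a table without a unique index (hypothetical).
class FakeTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # What the uniqueness validation does: a SELECT for an existing row.
  def exists?(name)
    @rows.any? { |row| row[:name] == name }
  end

  # What save! does after the validation passes: an INSERT.
  def insert(name)
    @rows << { id: @rows.size + 1, name: name }
  end
end

# Interleaves two jobs by hand: both validate first, then both insert.
def simulate_race(name)
  table = FakeTable.new
  job_a_valid = !table.exists?(name) # job A: SELECT finds nothing
  job_b_valid = !table.exists?(name) # job B: SELECT finds nothing either
  table.insert(name) if job_a_valid  # job A: INSERT
  table.insert(name) if job_b_valid  # job B: INSERT, now a duplicate
  table.rows
end
```

Both validations pass because neither job has inserted yet, so simulate_race("report") returns two rows with the same name.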


Unique Index

The first solution you can come across when you research this is using unique indexes in the database. Let’s see an example:

Suppose we have an ActiveRecord Document model with a uniqueness validation. Even with that Rails validation in place, if we insert the same data into the documents table at almost the same time, we can end up with duplicated rows.

That happens because we have a race condition in the insertion of Documents. We end up with a row that is invalid according to our Rails validations but was persisted anyway, which leaves our database inconsistent. If the duplicated row has id 4, running Document.find(4).valid? returns false, yet the record is still there. That’s worrying, right? So how do we guarantee consistency in the database? By adding a unique index on the validated columns of the documents table through a migration.
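A migration for such an index could look roughly like the sketch below. It is meant to live in a Rails app, not to run on its own. The class name and the indexed columns (name, description and date, borrowed from the token example later in this post) are assumptions, and the bracketed version should match your Rails release (Rails 4 and earlier use plain ActiveRecord::Migration).

```ruby
# db/migrate/<timestamp>_add_unique_index_to_documents.rb (illustrative file name)
class AddUniqueIndexToDocuments < ActiveRecord::Migration[5.0]
  def change
    # Assumed columns; use the ones covered by your uniqueness validation.
    add_index :documents, [:name, :description, :date], unique: true
  end
end
```

If the table already contains duplicated rows, creating the index will fail until they are cleaned up.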

So with our new index, the Rails validation can still miss the duplicate when two jobs run at the same time, but when the insert query runs, the database rejects it and Rails raises an ActiveRecord::RecordNotUnique exception. If you want to avoid having all those errors in Sidekiq, you could catch them and do nothing.
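One way to catch them and do nothing is a small helper that swallows only the duplicate-key error. This is a sketch and ignoring_duplicates is a hypothetical name; the error class is passed in so the helper itself has no Rails dependency.

```ruby
# Runs the block and returns true if it completed. An error of the given
# class is swallowed and reported as false; any other error still propagates.
def ignoring_duplicates(error_class)
  yield
  true
rescue error_class
  false
end

# Inside a job, assuming ActiveRecord is loaded and attrs holds the attributes:
#   ignoring_duplicates(ActiveRecord::RecordNotUnique) { Document.create!(attrs) }
```

Rescuing only that one class keeps every other failure visible in Sidekiq, so real bugs are not silently hidden together with the duplicates.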

Now… what happens if you have a huge table that already has lots of indexes, and the index you need to add spans lots of columns? For that scenario, we came up with the next solution.


Unique Token Column

You can generate a unique token from the values of the columns used in the uniqueness validation. To generate the token you could run something like this:

Digest::SHA1.hexdigest([self.name, self.description, self.date.to_s].join)

Then save it in a text column with a unique index.
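One detail worth checking when you build the token: joining the values without a separator means different field combinations can produce the same string, and therefore the same token. The sketch below shows the effect with two hypothetical helpers, plain_token (the approach above) and separated_token (a variant that assumes the separator never appears in the values).

```ruby
require "digest"

# Same approach as above: concatenate the values and hash the result.
def plain_token(name, description, date)
  Digest::SHA1.hexdigest([name, description, date.to_s].join)
end

# Variant with a NUL separator between the values (assumed to never
# occur inside the values themselves).
def separated_token(name, description, date)
  Digest::SHA1.hexdigest([name, description, date.to_s].join("\0"))
end
```

With plain_token, the pairs ("ab", "c") and ("a", "bc") for name and description hash to the same token on the same date, so the unique index would reject a row that is not really a duplicate. With separated_token they get different tokens.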

We had this issue with lots of models and we didn’t want to duplicate that code everywhere, so we extracted it into a concern called UniqueTokenValidator.

So… by including the UniqueTokenValidator concern you get a token generated every time the model is validated and saved, that is, whenever it is created or updated. You just need to define the UNIQUE_FIELDS constant with the fields you want to validate, plus the condition if you have one. And if you still want the Rails validation, just add it to the model as well.
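As a rough illustration of the idea (not the original concern), the following plain-Ruby sketch derives a token from a UNIQUE_FIELDS constant. The UniqueTokenSketch and DocumentSketch names are made up for this example and the optional condition is left out. In a Rails model the token would be assigned to the unique column from a callback such as before_validation instead of being computed on demand.

```ruby
require "digest"

# Hypothetical plain-Ruby stand-in for the concern described above.
module UniqueTokenSketch
  # Builds the token from the fields listed in the including class's
  # UNIQUE_FIELDS constant, in the order they are declared.
  def unique_token
    values = self.class::UNIQUE_FIELDS.map { |field| public_send(field).to_s }
    Digest::SHA1.hexdigest(values.join)
  end
end

# A Struct is used only so the example runs without Rails.
class DocumentSketch < Struct.new(:name, :description, :date)
  include UniqueTokenSketch

  UNIQUE_FIELDS = %i[name description date].freeze
end
```

Two DocumentSketch instances with the same name, description and date produce the same unique_token, which is what lets a unique index on the token column reject the second insert.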

Keep in mind that if you use this workaround with a model where the set of columns included in the unique token changes often, you will need a similar plain SQL approach to regenerate the hash for the existing rows every time it changes, because iterating through all your rows with Rails can take a long time. For any other model you will be fine just including the UniqueTokenValidator and adding your unique column. Pretty easy, right?

Posted by Esteban Pintos (esteban.pintos@wolox.com.ar)