Rails: make sure you have proper DB indexes for your model’s unique validations

In this article I’m going to explain you what performance issues you may have when use unique validations in Rails models.

Problem

Let’s start with a small example. We have the following User model:

class User < ApplicationRecord 
validates :login, uniqueness: true
end
class CreateUsers < ActiveRecord::Migration[5.1]  
def change
create_table :users do |t|
t.string :login
t.string :full_name
      t.timestamps    
end
end
end

and the following UsersController:

class UsersController < ApplicationController
def create
@user = User.new(user_params)
  unless @user.valid?
render json: {
status: 422,
message: "Validation error",
errors: @user.errors
}.to_json
return
end
  @user.save!
  render json: {
status: 200,
message: "Created"
}.to_json
end
private
def user_params
params.require(:user).permit(:login)
end
end

Now let’s try to create 2 users with the same login.

curl -X POST http://0.0.0.0:3000/users -d user[login]=garry

When perform a POST request 2nd time you will get the following error:

{“status”:422,”message”:”Validation error”,”errors”:{“login”:[“has already been taken”]}}

This is correct, but let’s check what is going on under the hood — let’s look into Rails console. The following set of logs are there:

Started POST “/users” for 127.0.0.1 at 2017–09–13 16:16:01 +0300
Processing by UsersController#create as */*
Parameters: {“user”=>{“login”=>”garry”}}
User Exists (0.2ms) SELECT 1 AS one FROM “users” WHERE “users”.”login” = ? LIMIT ? [[“login”, “garry”], [“LIMIT”, 1]]
Completed 200 OK in 5ms (Views: 0.1ms | ActiveRecord: 0.2ms)

As you can see there is an extra request to DB which tries to find a user with provided login. This is exactly how this line “validates :login, uniqueness: true” works. Before each ‘record save’ it performs an additional request to DB to validate the provided data for uniqueness.

Everything works super fast when you have 10, 100, 1000 users. And you even will not notice any performance issues.

But things will start going worse and worse with more users you have. Performance of this API endpoint will degrade.

The following ‘EXPLAIN’ output proves that — this request does not use any indexes hence it scans the whole Users table:

sqlite> EXPLAIN QUERY PLAN SELECT 1 AS one FROM “users” WHERE “users”.”login” = “garry” LIMIT 1;
0|0|0|SCAN TABLE users

Solution

You actually have 2 ways to fix it:

#1. Add proper DB index

To fix it you have to add an index by login field. The following migration does it:

class AddLoginIndexToUsers < ActiveRecord::Migration[5.1]
def change
add_index :users, [:login]
end
end

Let’s check now what the ‘EXPLAIN’ command tells us:

sqlite> EXPLAIN QUERY PLAN SELECT 1 AS one FROM “users” WHERE “users”.”login” = “garry” LIMIT 1;
0|0|0|SEARCH TABLE users USING COVERING INDEX index_users_on_login (login=?)

Bingo! Now our validation will work quite fast and does not matter how many users we have. Because now we use an index!

#2. Move unique validation to DB level

This time we can go further — we will move our unique validation to DB level. Let’s remove this line from Users model “validates :login, uniqueness: true” and add the following migration:

class AddLoginUniqIndexToUsers < ActiveRecord::Migration[5.1]
def change
remove_index :users, column: :login
add_index :users, [:login], unique: true
end
end

and modify a little bit our controller’s ‘create’ action:

def create
@user = User.new(user_params)
  begin
@user.save!

render json: {
status: 200,
message: “Created”
}.to_json
rescue ActiveRecord::RecordNotUnique => e
render json: {
status: 422,
message: “Validation error: login has already been taken”,
}.to_json
end
end

Now let’s try to create a new record with existing login — we are getting the same validation error response. But let’s look under the hood, what’s going on in Rails console:

Started POST “/users” for 127.0.0.1 at 2017–09–13 16:59:01 +0300
Processing by UsersController#create as */*
Parameters: {“user”=>{“login”=>”garry”}}
(0.0ms) begin transaction
SQL (0.4ms) INSERT INTO “users” (“login”, “created_at”, “updated_at”) VALUES (?, ?, ?) [[“login”, “garry”], [“created_at”, “2017–09–13 13:59:01.333136”], [“updated_at”, “2017–09–13 13:59:01.333136”]]
(0.0ms) rollback transaction
Completed 200 OK in 11ms (Views: 0.1ms | ActiveRecord: 1.3ms)

There is no extra DB query now, but effect is the same — it works quite fast the big number of users :) .

Which solution to choose

In my opinion, the first one is good because it obviously shows that our model has an unique login validation hence you do not need to dig into schema.rb to understand this. But 2nd solution is better because it does not perform an extra query to DB. So choose that one the suites better for you. If you need extra performance then probably 2nd approach is better for you.

UPD1:
Well, sometimes things can go wrong when use Rails uniqueness validations and you actually will find NOT unique data in your DB :( . You can find more information why it happens in this article . From my own experience, we recently migrate from Rails uniqueness validations to unique DB indexes. With 30M records we found 20K not valid and our Rails migration failed. That was a lot of pain then to clean it… I hope the Rails team will come up with some better solution in near future, otherwise this feature is not strong enough and can lead to many issues. So I recommend to use 2nd solution from day1.

How to identify these issues in your existing project

I made a small gem that helps find above issues with missed indexes: https://github.com/soulfly/unique_validation_inspector

Let’s run it on an initial version of the project (w/o indexes):

$ rake inspect_unique_validations
You have the following unique validations:
Model ‘User’:
[:login] (scope ‘’). Index exists: false

It tells you that you do not have a proper index for ‘login’ validation. So now you know how to add it.

It’s very useful to use this gem in a big projects where it’s just impossible to analize what indexes are missed.

Conclusion

You can download this project source code https://github.com/soulfly/unique_validation_inspector_demo

Please ask me if you have any questions. Cheers!

A single golf clap? Or a long standing ovation?

By clapping more or less, you can signal to us which stories really stand out.