Implement Reader/Writer Databases in your apps with Rails 6+ and ActiveRecord

erum sanwari
Railsfactory
Published in
6 min readJan 13, 2024
Reader/Writer Databases with Rails 6+ and ActiveRecord

In extensive projects dealing with massive datasets, a common challenge emerges — READ operations often hit performance bottlenecks. To address database speed issues, a widely adopted practice involves the introduction of multiple databases.

Historically, Rails had the support but lacked native management for this solution until version 5.x. As a workaround, developers turned to gems and external libraries, with Makara emerging as a popular choice. With Rails 6 supporting multiple databases in a single app, this obstruction was finally lifted up!

In this Article, we are going to explore part of multi db implementation — Reader/Writer Databases with Rails. Read till the end for an interesting observation!!

Below is a step by step guide on how to implement and use 2 databases, one acting as primary and the other being a replica of the primary db, used for READ operations. I have created a minimal task tracker application with User, Category and Task models. Below are the Model class pseudocode.

To get the full code, you can check this github repo.

class User < ApplicationRecord
has_and_belongs_to_many :categories
has_many :tasks

...
end

class Task < ApplicationRecord
has_and_belongs_to_many :categories
belongs_to :user

...
end

class Category < ApplicationRecord
has_and_belongs_to_many :users
has_and_belongs_to_many :tasks

...
end

For setting up database replica, modify database.yml to contain config similar as mentioned below. I am using Mysql.

default: &default
adapter: mysql2
pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
timeout: 5000
username: root
password:

development:
# This is the primary database. By default, Rails will write to 'writer'. Name can be modified
writer:
<<: *default
database: task_tracker_development

# This is the replica database
reader:
<<: *default
database: task_tracker_development
username: root_readonly
replica: true

Few pointers to keep in mind

  • Make sure the database names for primary and replica db are same. This allows Rails to identify that both databases are essentially the same, differing only in terms of read and write access.
  • Differentiate the databases by usernames/url etc.
  • Do not forget replica: true for reader db. This is how Rails identify the reader database.

Next I created controllers and views. I also wanted to put in session authentication, so my user is created using registration.

The next step is to connect models to both databases, so that both can be consumed. How do we do that?

ActiveRecord provides us with a connects_to macro which can be implemented in any Abstract Class.

class ApplicationRecord < ActiveRecord::Base
primary_abstract_class

connects_to database: { writing: :writer, reading: :reader }
end

Now, let’s find out how rails will connect and identify which queries will be executed in reader and which ones in writer. ActiveRecord provides a class method called connected_to which takes role as an argument, and we can mention which db we want the code to be executed in. Check the following.

# To use replica database
ActiveRecord::Base.connected_to(role: :reading) do
@tasks = Task.all
end

# To use write/primary database
ActiveRecord::Base.connected_to(role: :writing) do
@tasks = Task.new(params).save
end

Pretty simple right?! But wait, do we need to manually implement this for every query? Well, that would be a nightmare!

Thanks to Rails primitive automatic role switching support, we don’t need to do this manually:) Rails official doc says

Automatic switching allows the application to switch from the writer to replica or replica to writer based on the HTTP verb and whether there was a recent write by the requesting user.
If the application receives a POST, PUT, DELETE, or PATCH request, the application will automatically write to the writer database. If the request is not one of those methods, but the application recently made a write, the writer database will also be used. All other requests will use the replica database.

which means that for GET or HEAD query requests, reader replica db would be used, while for POST, PUT, DELETE, or PATCH requests, it will access writer db. But we need to activate a middleware to bring this into play. Just run

rails g active_record:multi_db

This will create an initializer file called multi_db. In that file, uncomment the following lines.

Rails.application.configure do
config.active_record.database_selector = { delay: 2.seconds }
config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
config.active_record.database_resolver_context = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
end

This means that if a read operation happens within the delay interval (here, 2 seconds), then even the read queries will be directed to writer database. This helps in keeping the write-then-read action atomic. The delay can be modifies as per db infrastructure.

This is it! We have successfully used reader/writer databases. But wait, how do we verify whether it is actually using those writer and reader databases in the correct places?

Let’s use ActiveRecord::Base.connection_pool.db_config.name to print the name of database it is accessing.

In tasks_controller.rb, I printing the above post running the Task.all query.

class TasksController < ApplicationController
def index
@tasks = Task.all
p "task index", ActiveRecord::Base.connection_pool.db_config.name
end

.
.
.
end

This prints the following in server logs

Voila! The index API (READ request) is using reader database. Now let’s check for writer. I have create api in my tasks controller.

class TasksController < ApplicationController
def index
@tasks = Task.all
p "task index", ActiveRecord::Base.connection_pool.db_config.name
end

def create
@task = Task.new(task_params.reject{|k| k=="category"}.merge({user: current_user}))
p "task create", ActiveRecord::Base.connection_pool.db_config.name

if @task.save!
category = params[:task][:category]
@task.categories.find_or_create_by!(name: category, resource_type: "Task")
p "task create save", ActiveRecord::Base.connection_pool.db_config.name

redirect_to tasks_path
else
render :new, status: :unprocessable_entity
end
end
.
.
.
end

This is what I get in server logs.

Note all 3 print statements. In this case, task.save and task.categories.create, both are write operations. Hence we are getting writer as the database.

But note another action — I am redirecting the create API to index action, and there is another print statement in index API. If you see, even the index(GET) query is coming up in writer database.

The reason for this is the delay of 2 seconds in auto role switch middleware config. Since this READ(index) operation happened within 2 seconds of the previous write operation, Rails maintained its guaranteed “read your own write” policy.

Now let’s test this delay by changing it to 2 ms. It is obvious that 2 ms is extremely short duration for read operation to get over. Wait for an interesting show!

So now, your multi_db.rb initializer looks like this.

Rails.application.configure do
# changing the delay to 2 ms
config.active_record.database_selector = { delay: 20.seconds/10000.to_f }
config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
config.active_record.database_resolver_context = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
end

When we run the create API again, we observe the following.

This time, even though index API rn along with create API, READ operation used reader database. This is because we reduced the delay to 2ms which was not enough for read operation to complete.

Let’s say we still want this index API to use writer database. Here we can implement ActiveRecord’s connected_to method with role as writing. Check out the response!

def index
ActiveRecord::Base.connected_to(role: :writing) do
@tasks = Task.all
p "task index", ActiveRecord::Base.connection_pool.db_config.name
end
end

That’s all! Rails’ database replica feature empowers developers to enhance application performance and resilience by efficiently distributing read and write workloads. Leveraging reader/writer replicas with thoughtful configurations provides a robust solution for managing large-scale databases, optimizing response times, and ensuring a seamless and reliable user experience.

--

--