Comparing the performance of RoR and Go workers in background processing. Part 1 — RoR API and worker.
This is the first of two articles that will lead you through building Ruby and Go workers that perform the same task and log duration statistics. We will also write an API to interact with these workers. The workers should solve the following abstract problem:
We are given a large number of survey results about some product. Each mark is stored in the database in a field named 'value'.
The task is to fetch a page of results (by page and per_page), calculate the min, max and average value among the items on that page, and create a new survey entry storing the average of these three values.
Extra: although these operations could be tuned, we will use straightforward collection operations to keep the comparison fair. We will also track the time needed to process each page and store these statistics in a separate entry.
Note that a Ruby vs Go performance comparison is already present on the page of this package. Besides the comparison, these articles also describe building and integrating the two apps. If you are here for the comparison results only, jump to the end of Part 2. The source code of the API and the Ruby worker is available here. The source code of the Go worker is available by this link.
So, let's implement our API. Here are the simple steps to set up an API server with Sidekiq-based background processing.
1) Create a new app using the Rails generator. An API-only app is sufficient to present the results of our work:
rails new rails_survey_calculator --api
2) Gems. We will use Postgres as the database and Sidekiq for background processing. Also, reading secrets from ENV is more convenient than from config files, so let's add the corresponding gems to the Gemfile:
gem 'dotenv'
gem 'dotenv-rails'
gem 'pg'
gem 'sidekiq'
and run bundle install.
3) Set the environment variables the application needs. Here is a sample .env.development file:
export POSTGRES_USERNAME=postgres
export POSTGRES_PASSWORD=better_generate_your_own_password
export POSTGRES_HOST=localhost
export POSTGRES_DATABASE=survey_calculator_development
export POSTGRES_PORT=5432
export REDIS_URL=redis://localhost:6379/0
export WEB_CONCURRENCY=1
export RAILS_ENV=development
export LOG_LEVEL=debug
export SECRET_KEY_BASE=generate secret and insert it here
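The SECRET_KEY_BASE placeholder above expects a long random hex string. You can generate one with rails secret, or, as a sketch, with Ruby's SecureRandom directly (which produces the same shape of value):

```ruby
require 'securerandom'

# Generate a 128-character hex string suitable for SECRET_KEY_BASE.
secret = SecureRandom.hex(64)
puts secret
```

Paste the printed value into .env.development in place of the placeholder.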
4) Our .yml files should be updated to read config values from ENV.
config/database.yml
default: &default
  adapter: postgresql
  encoding: unicode
  pool: 5
  timeout: 5000
  username: <%= ENV["POSTGRES_USERNAME"] %>
  password: <%= ENV["POSTGRES_PASSWORD"] %>
  host: <%= ENV["POSTGRES_HOST"] %>
  port: <%= ENV["POSTGRES_PORT"] %>

development:
  <<: *default
  database: <%= ENV["POSTGRES_DATABASE"] %>

production:
  <<: *default
  database: <%= ENV["POSTGRES_DATABASE"] %>

test:
  <<: *default
  database: <%= ENV["POSTGRES_DATABASE"] %>
config/secrets.yml
development:
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
test:
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
production:
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
config/initializers/sidekiq.rb
sidekiq_config = { url: ENV['REDIS_URL'] }

Sidekiq.configure_server do |config|
  config.redis = sidekiq_config
end

Sidekiq.configure_client do |config|
  config.redis = sidekiq_config
end
Because these config files no longer contain any sensitive data, it's fine to keep them under version control instead of adding them to .gitignore.
At this point, our application should render the greeting page without errors.
5) Create migrations
To store task details:
rails g migration CreateTasks page:integer per_page:integer
To store the survey results whose values we will process:
rails g migration CreateSurveyResults value:float task:references
To store statistics of task processing:
rails g migration CreateStatistics task:references handler_type:string collection_size:integer duration:float
Now we can create the database and run the migrations:
rails db:create && rails db:migrate
6) Create the models and add pagination to survey results.
app/models/survey_result.rb
class SurveyResult < ApplicationRecord
  scope :page, ->(page, per_page) { limit(per_page).offset([page.to_i - 1, 0].max * per_page) }
end
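The offset arithmetic in the page scope can be checked in isolation. Below is the same calculation as a plain Ruby lambda (a hypothetical helper for illustration, not part of the app), showing how 1-based page numbers map to offsets and how pages below 1 are clamped:

```ruby
# Same offset math as the SurveyResult.page scope:
# pages are 1-based, and anything below 1 is clamped to the first page.
offset_for = ->(page, per_page) { [page.to_i - 1, 0].max * per_page }

offset_for.call(1, 10)  # => 0   (first page starts at the beginning)
offset_for.call(3, 10)  # => 20  (skip the first two pages of 10)
offset_for.call(0, 10)  # => 0   (clamped to the first page)
```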
app/models/task.rb
class Task < ApplicationRecord
  has_many :statistics
end
app/models/statistic.rb
class Statistic < ApplicationRecord
  belongs_to :task
end
7) Add endpoints to create tasks and view statistics:
routes.rb
Rails.application.routes.draw do
  namespace :api do
    namespace :v1 do
      resources :tasks, only: :create
      resources :statistics, only: :index
    end
  end
end
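With this file in place, rails routes should list our two endpoints (output abridged):

```
POST  /api/v1/tasks       api/v1/tasks#create
GET   /api/v1/statistics  api/v1/statistics#index
```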
app/controllers/api/v1/tasks_controller.rb
module Api
  module V1
    class TasksController < ApplicationController
      def create
        @task = Task.create(task_params)
        render json: {
          message: "Task ##{@task.id} successfully created."
        }
      end

      private

      def task_params
        params[:task].permit(:page, :per_page)
      end
    end
  end
end
app/controllers/api/v1/statistics_controller.rb
module Api
  module V1
    class StatisticsController < ApplicationController
      def index
        @task = Task.last
        render json: @task.statistics.as_json(only: %i[handler_type collection_size duration])
      end
    end
  end
end
At this point, we should be able to submit a request to the tasks endpoint and see the message about successful creation. Besides standard tools like curl or Postman, you can use this web-ui client (see README.md for installation and usage).
post: localhost:3000/api/v1/tasks
{
  "task": {
    "page": 1,
    "per_page": 1
  }
}
Response
{
  "message": "Task #1 successfully created."
}
8) Implement the handler
app/workers/ruby_calculator.rb
class RubyCalculator
  include Sidekiq::Worker

  def perform(task_id)
    # Task processing time start
    time = Time.now
    task = Task.find(task_id)
    db_page = SurveyResult.page(task.page, task.per_page)
    return if db_page.count.zero?

    total_avg = page_avg(db_page)
    SurveyResult.create(task_id: task_id, value: total_avg)
    # Task processing duration
    duration = Time.now - time
    Statistic.create(task_id: task_id, handler_type: 'ruby', duration: duration, collection_size: db_page.count)
  end

  def page_avg(db_page)
    # Start min high and max low so the first value initializes both correctly
    min = Float::INFINITY
    max = -Float::INFINITY
    sum = 0
    db_page.find_in_batches do |batch|
      batch.each do |survey_result|
        max = survey_result.value if survey_result.value > max
        min = survey_result.value if survey_result.value < min
        sum += survey_result.value
      end
    end
    avg = sum / db_page.count
    (min + max + avg) / 3
  end
end
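The combined metric that page_avg returns can be sanity-checked without a database. Here is the same calculation over a plain in-memory array (a sketch for illustration, not code from the app):

```ruby
# Min, max and mean of the page values, then the mean of those three numbers,
# mirroring what page_avg computes over ActiveRecord batches.
def combined_metric(values)
  min = values.min
  max = values.max
  avg = values.sum.to_f / values.size
  (min + max + avg) / 3
end

combined_metric([2.0, 4.0, 9.0]) # => 5.333... (min 2, max 9, avg 5)
```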
9) Enqueue a new job when a task is created.
app/models/task.rb
class Task < ApplicationRecord
  has_many :statistics
  after_create :process_in_background

  private

  def process_in_background
    RubyCalculator.perform_async(id)
  end
end
10) Add a rake task to create 100000 survey result entries.
lib/tasks/db/populate_survey_results.rake
namespace :db do
  desc 'Generates 100000 survey results'
  task populate_survey_results: :environment do
    records = []
    (1..100_000).each do |time|
      value = rand(1000)
      records << { value: value }
      # Log progress and save every 5000 records
      next unless (time % 5000).zero?

      puts "Built #{time} entries"
      puts 'Saving...'
      SurveyResult.create records
      records = []
    end
  end
end
11) Check what we have done. At this point, launch the API and the background processing by running
rails server
and
sidekiq
in separate tabs.
Let's see how it performs at this point. Send a request to the tasks endpoint to process 10 entries.
post: localhost:3000/api/v1/tasks
{
  "task": {
    "page": 1,
    "per_page": 10
  }
}
Let's see the statistics for handling this task (results for an MBP 15 2015 with default Sidekiq settings):
get: localhost:3000/api/v1/statistics
[
  {
    "handler_type": "ruby",
    "collection_size": 10,
    "duration": 0.011306
  }
]
At this point, all the RoR API and worker preparations are done. Proceed to Part 2 for the detailed Go worker setup and test results.