Background Jobs in Rails Using RabbitMQ and Sneakers

A case study of handling 10k+ row updates using a CSV file as input

If you are new to back-end engineering, you might think that an API flow is as simple as request, process, then response. Well, at least that’s what I thought when I decided to start a new journey as a back-end engineer around 6 months ago. Unfortunately, as you might already know, the reality is not that simple. There are cases where the processing part of an API is so heavy that it takes seconds or even minutes to respond to a single request. This is of course a very bad experience for users, especially at peak hours when hundreds or thousands of users might access your API at the same time. I mean, you would never use Google if it took at least a minute to run a single search query and requests frequently timed out, right?

One of my early tasks as a back-end engineer required me to deal with this kind of scenario using a background job: instead of waiting for the process to finish before giving a response, you respond immediately and leave the work on the server to be processed later. In this article I will share what I learned about background jobs through a case study of an admin who wants to bulk update records in a database.

Prologue

Before we dive into the code, I want you to have a clear grasp of the challenge that we will try to solve using a background job. You need to understand that when dealing with a logic-heavy API, response time is the most merciful enemy you might encounter compared to other problems, such as unexpected behaviour that will mess up your database.

So, now, let’s imagine that you have an admin dashboard that is used to update users’ balances, and the admin wants to increase or reduce hundreds of users’ balances on a daily basis. Instead of torturing the admin by asking them to update the records manually one by one, we ask the admin to create a list of the balance mutations in a CSV file, upload it to our back end, and let our magic handle the rest. Here is the example code that we have to handle it:
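A minimal sketch of such a synchronous handler, assuming a CSV with hypothetical user_id and amount columns and an in-memory hash standing in for the users table (in a Rails app this would be a model update inside the controller action):

```ruby
require 'csv'

# Applies every row of the CSV immediately, inside the request.
# `balances` (user_id => balance) is a hypothetical stand-in for the
# users table; the column names are assumptions as well.
def apply_csv_synchronously(csv_text, balances)
  CSV.parse(csv_text, headers: true).each do |row|
    user_id = Integer(row['user_id'])
    amount  = Integer(row['amount'])
    balances[user_id] = balances.fetch(user_id) + amount
  end
  balances
end

csv_text = <<~ROWS
  user_id,amount
  1,500
  2,-200
ROWS

balances = { 1 => 1_000, 2 => 1_000 }
apply_csv_synchronously(csv_text, balances)
# balances is now { 1 => 1500, 2 => 800 }
```

The whole file is processed before the response is sent, so the request time grows with the number of rows.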

It might not be a problem if we just want to update 100 or so records, but when the admin wants to update thousands of records, our magic will turn into a curse that messes up our database in unpredictable ways. Sometimes it is as simple as a failure to update the balance, which leaves the user’s balance untouched. At other, unluckier times it might apply the update to a user’s balance multiple times instead of once, which is of course beneficial for those whose balance is increased but will trigger massive waves of complaints from users whose balance is reduced.

Install RabbitMQ Using Docker

RabbitMQ is message-broker software, which basically means that it acts as a broker that manages where our data (messages) is delivered, handles queues, retries, etc. This is where we temporarily store our data before Sneakers consumes it and executes the background job. As alternatives, you can use Sidekiq or Kafka. I prefer RabbitMQ: it offers more flexibility than Sidekiq, which requires you to buy a premium licence if you want to use certain features, and it provides a more elegant way to deal with simple background jobs than Kafka, which is better at more complex and heavy jobs, especially ones that deal with big data.

First, let’s install Docker. Why? Trust me, your life as a back-end engineer will be better once you familiarize yourself with Docker. In our case, using Docker will make it easier to install RabbitMQ regardless of the OS you have. You can follow the official installation instructions to install Docker. To verify the Docker installation, you can run docker run hello-world, which will pull and run a Docker image called hello-world. If Docker is installed successfully, the output includes the line "Hello from Docker!".

After that, it’s time to prepare the installation of RabbitMQ using Docker. Let’s create a new folder and a new file called docker-compose.yml inside it. This file tells Docker which image it should pull and the configuration it should use when we run the image. Here is the content of the docker-compose.yml file:

Finally, run docker-compose up to download the RabbitMQ image; at the end of the installation, Docker will run RabbitMQ in your console. You can kill the process using Ctrl+C, and if you want to start it again you can use docker-compose start instead, which only runs the existing container without pulling the image. To check running containers, you can use docker ps, and you will see that rabbitmq is up and running properly on port 5672.

By the way, if you want to stop the containers that you started with docker-compose start, you can use docker-compose stop.

Now imagine that we need not only RabbitMQ but also, let’s say, MySQL, Elasticsearch, etc. It would be really painful if we had to install each of them separately and run them one by one each time we wanted to use them. In my case, before using Docker, I frequently ran into errors in my code simply because I forgot to run one of the services my API required. I hope you don’t repeat my stupid mistake.
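One cheap guard against that mistake is to check that the port is reachable before blaming your code. The sketch below uses only Ruby's standard library; the port_open? helper is a hypothetical name, and 5672 is the RabbitMQ port mentioned earlier.

```ruby
require 'socket'

# Returns true when something accepts TCP connections on host:port.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, timed out, or the host could not be resolved.
  false
end

# Warn early instead of debugging a confusing error later.
unless port_open?('localhost', 5672)
  warn 'RabbitMQ does not seem to be running on localhost:5672'
end
```

The same helper can be pointed at any other service port your API depends on.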

Install a RabbitMQ Plugin

Now this part is optional, but in the future you might need a tutorial on how to install a RabbitMQ plugin. If you don’t need it, you can skip to the next part. But if you think you might, let’s try installing the rabbitmq_delayed_message_exchange plugin, which enables us to delay messages, as follows:

  1. Go to bash in the rabbitmq container using docker exec -ti {CONTAINER_ID} /bin/bash, where you can find the container ID by running docker ps
  2. Go to the plugins directory and download the plugin using the following commands:
> wget https://dl.bintray.com/rabbitmq/community-plugins/3.7.x/rabbitmq_delayed_message_exchange/rabbitmq_delayed_message_exchange-20171201-3.7.x.zip
> unzip rabbitmq_delayed_message_exchange-20171201-3.7.x.zip
> rm rabbitmq_delayed_message_exchange-20171201-3.7.x.zip
  3. Finally, run rabbitmq-plugins enable rabbitmq_delayed_message_exchange to enable the plugin
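Once the plugin is enabled, a delayed message is published through an exchange of type x-delayed-message, with the delay in milliseconds given in the x-delay header. The sketch below assumes a Bunny-style channel object is passed in; the publish_delayed helper and the exchange name delayed.mutations are hypothetical.

```ruby
require 'json'

# Publishes `payload` so that RabbitMQ routes it only after `delay_ms`.
# `channel` is expected to behave like a Bunny::Channel.
def publish_delayed(channel, routing_key, payload, delay_ms)
  exchange = channel.exchange(
    'delayed.mutations',                          # hypothetical exchange name
    type: 'x-delayed-message',                    # exchange type added by the plugin
    durable: true,
    arguments: { 'x-delayed-type' => 'direct' }   # routing behaviour after the delay
  )
  exchange.publish(
    JSON.generate(payload),
    routing_key: routing_key,
    headers: { 'x-delay' => delay_ms }            # delay in milliseconds
  )
end
```

A queue still has to be bound to this exchange with the same routing key, otherwise the message has nowhere to go once the delay expires.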

Now, Let’s Refactor our Code ( Yeay :| )

Let’s be honest: as developers, refactoring is the part of the job that we hate the most, but unfortunately it has to be a good half of our job if we want to keep our sanity the next time we read our own code. In our case, let’s start by initiating a connection between our Rails app and RabbitMQ using a gem called bunny. Add gem 'bunny', '>= 2.9.2' to our Gemfile and then run bundle install. After that, create a new file rabbitmq_initializer.rb in config/initializers/ as follows:

BUNNY_AMQP_ADDRESSES is where our RabbitMQ is running, usually localhost:5672. For the user and password you can use the default RabbitMQ login, which is guest / guest. The vhost needs to be created with the following commands:

  1. Open rabbitmq bash again using docker exec -ti {CONTAINER_ID} /bin/bash
  2. Run rabbitmqctl add_vhost railsmq to create the vhost
  3. Run rabbitmqctl set_permissions -p railsmq guest ".*" ".*" ".*" to grant our default credentials full permissions on the vhost
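Putting these settings together, the connection options for the initializer might be assembled roughly as below. This is a sketch, not the original file: BUNNY_AMQP_ADDRESSES comes from the text above, while the other environment variable names and the bunny_options helper are hypothetical.

```ruby
# Builds the options hash for Bunny.new from environment variables.
# Only BUNNY_AMQP_ADDRESSES is named in the article; the other
# variable names are assumptions made for this sketch.
def bunny_options(env = ENV)
  {
    # Comma-separated "host:port" pairs, e.g. "localhost:5672"
    addresses: env.fetch('BUNNY_AMQP_ADDRESSES', 'localhost:5672').split(','),
    username: env.fetch('BUNNY_AMQP_USER', 'guest'),
    password: env.fetch('BUNNY_AMQP_PASSWORD', 'guest'),
    vhost: env.fetch('BUNNY_AMQP_VHOST', 'railsmq')
  }
end

# In config/initializers/rabbitmq_initializer.rb, with the bunny gem loaded:
#   connection = Bunny.new(bunny_options)
#   connection.start
#   channel = connection.create_channel
```

Keeping the values in environment variables means the same initializer works against the local Docker container and a remote broker.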

Publishing a Message to RabbitMQ

Now that we have configured the connection to RabbitMQ, it’s time to publish a message to RabbitMQ by creating a publisher class in publishers/mass_update_publisher.rb as follows:

QUEUE_NAME in the code above will be the routing_key for our message. This routing_key is important because it decides which queue our message is sent to. In our case the messages we want to send are mutations of user balances. These mutations will be stored in the queue until a service consumes them. That is what our API actually does: it sends a message to the queue to be processed later instead of processing the mutation directly.
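A publisher along these lines could look roughly like the sketch below. It assumes one JSON message per balance mutation and takes the channel as a constructor argument (in the app this would be a Bunny channel); the payload keys user_id and amount are hypothetical.

```ruby
require 'json'

class MassUpdatePublisher
  QUEUE_NAME = 'user.mass_update_using_csv'

  # `channel` is expected to behave like a Bunny::Channel.
  def initialize(channel)
    @channel = channel
  end

  # Sends one balance mutation, e.g. { 'user_id' => 1, 'amount' => 500 }.
  def publish(mutation)
    # Declare the queue first so the message is not dropped when no
    # worker has created it yet. durable: true is assumed here to match
    # the queue settings used on the consuming side.
    @channel.queue(QUEUE_NAME, durable: true)
    @channel.default_exchange.publish(
      JSON.generate(mutation),
      routing_key: QUEUE_NAME, # the default exchange routes by queue name
      persistent: true         # ask RabbitMQ to store the message on disk
    )
  end
end
```

The API action then only loops over the CSV rows and calls publish for each one, which is much faster than applying every update inside the request.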

Consuming RabbitMQ Messages Using Sneakers

After we publish the message, we need a service that will consume it and do the actual calculation and processing. To do this we will use a gem called Sneakers, by adding gem 'sneakers' to our Gemfile and then running bundle install. An alternative way to consume messages is to use the Bunny gem that we installed in the previous section, but I prefer Sneakers since it has a more specific purpose, which is to consume messages as background jobs. FYI, behind the scenes Sneakers also uses Bunny to consume the messages.

Just like when we used RabbitMQ to store a message, we also need to set up Sneakers, first by adding require 'sneakers/tasks' to the file called Rakefile in our Rails root directory. After that, we need to initiate the connection between our Rails app and Sneakers using config/initializers/sneakers.rb as follows:
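A minimal sketch of such an initializer is shown below. The environment variable names, the sneakers_settings helper, and the worker, thread, and prefetch counts are assumptions for illustration, not the article's original values.

```ruby
# config/initializers/sneakers.rb
# Settings passed to Sneakers.configure. The variable names and the
# numeric values are placeholders chosen for this sketch.
def sneakers_settings(env = ENV)
  {
    amqp: env.fetch('SNEAKERS_AMQP_URL', 'amqp://guest:guest@localhost:5672'),
    vhost: env.fetch('BUNNY_AMQP_VHOST', 'railsmq'),
    workers: 1,  # worker processes
    threads: 5,  # threads per worker process
    prefetch: 5  # unacknowledged messages fetched at a time
  }
end

# The guard only lets this sketch load in scripts without the gem;
# inside the Rails app Sneakers is already loaded by Bundler.
Sneakers.configure(sneakers_settings) if defined?(Sneakers)
```

The vhost here should be the same one the publisher connects to, otherwise the worker will look for the queue in a different namespace.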

After that, to consume the messages in user.mass_update_using_csv, we will create the file worker/mass_update_worker.rb as follows:

include Sneakers::Worker is the magic in this code. This module is what makes the worker stand by to receive messages from the queue. In this code we use the queue name from the publisher to ensure that we get messages from the correct queue, which is the user.mass_update_using_csv queue. The work function is the main function that is executed when Sneakers consumes a message from the queue; the message is passed in as the function’s msg parameter.
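The worker described above might be sketched as follows. The message format (JSON with user_id and amount), the BalanceMutation module, and the in-memory BALANCES hash are all hypothetical; a real worker would update the database instead.

```ruby
require 'json'

# Core of the job in plain Ruby: parse one message and apply it.
# `balances` (user_id => balance) is a hypothetical stand-in for the
# users table.
module BalanceMutation
  def self.apply(msg, balances)
    data = JSON.parse(msg)
    user_id = Integer(data.fetch('user_id'))
    amount = Integer(data.fetch('amount'))
    balances[user_id] = balances.fetch(user_id) + amount
  end
end

# Sneakers wiring, defined only when the sneakers gem is loaded.
if defined?(Sneakers)
  class MassUpdateWorker
    include Sneakers::Worker
    # Same name the publisher uses as its routing key.
    from_queue 'user.mass_update_using_csv'

    BALANCES = { 1 => 1_000 } # in-memory stand-in, illustration only

    def work(msg)
      BalanceMutation.apply(msg, BALANCES)
      ack!    # tell RabbitMQ the message was handled
    rescue JSON::ParserError, KeyError, ArgumentError, TypeError
      reject! # malformed message or unknown user: do not retry
    end
  end
end
```

With require 'sneakers/tasks' in the Rakefile, a worker like this is typically started with WORKERS=MassUpdateWorker rake sneakers:run.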