Asynchronous processing using SQS and Shoryuken

Wolox Engineering, Wolox · Oct 15, 2015

Usually, when a Rails application needs asynchronous processing, developers use Sidekiq, whose awesome dashboard helps manage failures and retries. It’s simple to set up and works like a charm on either Heroku or Amazon EC2 with Capistrano. At Wolox, we needed to scale an app that does a lot of web page scraping using Sidekiq.

For our app to scale, we needed lots of dynamic EC2 instances doing the asynchronous processing. But using Sidekiq with a single Redis database didn’t seem like the right choice when we thought about scalability. This made us think outside the box, which led us to SQS queues. They appeared to be a better option because SQS is highly available, distributed and cost-effective. Basically, instead of saving the asynchronous request data in the Redis database, we send an SQS message; workers then poll the queue, receive those requests and carry on to the processing stage.


This looks like a real pain, but it’s very easy to implement using Shoryuken. Its syntax and setup are similar to Sidekiq’s, so you can forget about Redis and start using Shoryuken with SQS queues. We started using it with one EC2 instance and it looked promising, but one thing worried us: Shoryuken lacks a dashboard to help us understand what is going on with the processing.

Because our app does a lot of scraping, loads of random errors appear as a result of website changes. When scaling, we were thinking about 10, 1,000 or 10,000 times more processing than before. That means lots of new errors, and the SQS console doesn’t give you any detailed mechanism to audit them. So we implemented a new dashboard, much as Sidekiq does, using a middleware that saves information in a Redis database.

But… as we said before, using Redis didn’t seem like a good idea, and yet here we were, back in the same predicament. The difference is that the data saved in Redis serves only as an audit trail; it does not handle any of the core message management, for which we rely on Amazon.

So, how does the dashboard work?

The middleware for Shoryuken jobs runs when:

  • Data for process X is enqueued in SQS
  • X starts
  • X fails

The middleware code may look something like this:
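The original snippet is not reproduced here, so the following is only a minimal sketch of the idea and not the code we ran. It assumes Shoryuken’s server middleware convention (an object whose `call(worker, queue, sqs_msg, body)` yields to run the job) and covers the start and failure hooks only. `AuditMiddleware`, the `reporter` argument and the event names are hypothetical.

```ruby
# Hypothetical audit middleware (names are illustrative, not part of Shoryuken).
# Assumes the server-middleware convention: call(worker, queue, sqs_msg, body)
# plus a block that runs the actual job.
class AuditMiddleware
  # reporter: any callable taking (event, details), e.g. a lambda writing to Redis
  def initialize(reporter)
    @reporter = reporter
  end

  def call(worker, queue, sqs_msg, body)
    details = { worker: worker.class.name, queue: queue, body: body }
    @reporter.call(:started, details)
    yield # run the job itself
  rescue StandardError => e
    @reporter.call(:failed, details.merge(error: "#{e.class}: #{e.message}"))
    raise # re-raise so the failure still propagates to Shoryuken
  end
end

# Registration, assuming Shoryuken's middleware chain API:
#   Shoryuken.configure_server do |config|
#     config.server_middleware { |chain| chain.add AuditMiddleware, reporter }
#   end
```

Passing the reporter in, rather than hard-coding a Redis call, keeps the middleware easy to exercise without a running Redis server.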

If you want to send some information to Redis from the middleware, you can do something like this for failed jobs:
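As a hedged illustration only (the original snippet is missing), a failed job could be recorded as a JSON entry pushed onto a Redis list. The helper name `record_failed_job`, the key `shoryuken:failed` and the payload fields are assumptions made for this sketch; the only Redis command used is `LPUSH`.

```ruby
require 'json'
require 'time'

# Hypothetical helper: push one failed-job record onto a Redis list.
# `redis` is any client responding to #lpush (the redis gem's client does);
# the key name 'shoryuken:failed' is an assumption for illustration.
def record_failed_job(redis, worker_name, queue, body, error)
  entry = {
    worker: worker_name,
    queue: queue,
    body: body,
    error: "#{error.class}: #{error.message}",
    failed_at: Time.now.utc.iso8601
  }
  redis.lpush('shoryuken:failed', JSON.generate(entry))
  entry
end

# Inside the middleware's rescue branch this could be called as:
#   record_failed_job(REDIS, worker.class.name, queue, body, e)
```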

Where REDIS is the connector for the Redis database.


In each middleware state, we sent information about the process status to Redis, using different keys depending on the information sent. On the other side, we built a simple Rails app whose only job is to fetch that information from the Redis database and display it in a more visually appealing way using AdminLTE.
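The read side of such a dashboard can be sketched as follows. This is an illustration under assumptions, not our actual controller code: it supposes failed jobs are stored as JSON strings in a Redis list under a hypothetical key, `shoryuken:failed`, and the function names are made up for the example.

```ruby
require 'json'

# Hypothetical read side for the dashboard: load the most recent failed jobs.
# `redis` is any client responding to #lrange; the key name is an assumption.
def recent_failed_jobs(redis, limit = 50)
  redis.lrange('shoryuken:failed', 0, limit - 1).map do |raw|
    JSON.parse(raw, symbolize_names: true)
  end
end

# Group the failures by worker so a view can render one table per worker.
def failures_by_worker(jobs)
  jobs.group_by { |job| job[:worker] }
end
```

A Rails controller action could call `recent_failed_jobs(REDIS)` and hand the grouped result to a view.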

With a few controllers, a few lines of jQuery and a bit of new CSS, you have a pretty dashboard where you can see all the failed jobs, retry them, delete them or manage all your distributed EC2 instances if necessary. If in the future we want to store our data in some other service, we can swap the data connector (instead of using Redis) and everything keeps running smoothly.
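One way to keep the connector swappable is to hide the storage behind a small interface. The sketch below is hypothetical: `MemoryFailedJobStore`, its `add`/`all`/`delete` methods and the `enqueue` callable are names invented for the example, with an in-memory hash standing in for Redis or any other service.

```ruby
# Hypothetical store interface the dashboard and middleware could share:
#   #add(entry) -> id, #all -> entries with :id, #delete(id) -> true/false
class MemoryFailedJobStore
  def initialize
    @entries = {}
    @next_id = 0
  end

  def add(entry)
    id = (@next_id += 1)
    @entries[id] = entry
    id
  end

  def all
    @entries.map { |id, entry| entry.merge(id: id) }
  end

  def delete(id)
    !@entries.delete(id).nil?
  end
end

# Retry: hand the stored body back to an enqueue callable, then drop the record.
# `enqueue` is an assumption; in a real app it might wrap an SQS send.
def retry_failed_job(store, id, enqueue)
  job = store.all.find { |entry| entry[:id] == id }
  return false unless job

  enqueue.call(job[:queue], job[:body])
  store.delete(id)
end
```

A Redis-backed class exposing the same three methods could replace the in-memory one without touching the retry logic.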

Our first version of the dashboard looked like this:

Pretty awesome, and overall it’s quite simple to implement… what do you think?

Posted by Esteban Pintos (esteban.pintos@wolox.com.ar)

www.wolox.com.ar
