Making Life Easier with Distributed Background Jobs

Managing Ecommerce Apps at Scale - Part 1

Published in Glazed Dev · Jan 29, 2021 · 5 min read

One of our projects involves maintaining and developing features for the online store of one of the biggest grocery store chains in Portugal. To manage the operation and get reports about it, business managers can export large files containing products, orders, and other critical business data. Depending on the parameters chosen, these tasks can become expensive in both time and resources.

For this reason, instead of performing the process synchronously, we initially executed these jobs in the background and, once completed, sent the result to the manager’s inbox. Executing these tasks in the background was already a significant improvement since we would not block Node.js’ main thread while we processed the manager’s request.

As the business scaled (and COVID was a big booster), these operations still posed an issue, because both the number of times the tasks were requested and the amount of data they covered grew. This increased the load on the system, sometimes unnecessarily, because some of those repeated requests might have been unintentional. In this scenario, the machine receiving the manager’s request would be the one doing the work, not taking advantage of other machines that might be experiencing less load (since we had no means of sharing work between them).

The solution we were looking for needed to fulfill the following requirements out-of-the-box:

We need to be able to distribute the load across the machines available
We need to limit how many times the task executes to ensure system stability

If your task execution rate is low, distributing the load across the system may not be needed. In our scenario, it proves to be useful since we can scale our system automatically. We want to handle multiple tasks from different managers, and if the load surpasses a certain threshold, we will spin up a new machine to cover the needs.

Allowing multiple tasks to be requested without limitation caused spikes and performance issues in our system. Limiting how many tasks of the same type can be executed lets us handle the load better, especially if tasks are not urgent. We want to avoid managers running more than one instance of the same export at a time.

We ended up choosing a distributed background tasks engine to handle our problem, and although we could’ve settled for another solution, it has the following advantages:

By introducing an engine that’s shared across the system, tasks can be executed by different machines (whereas previously, machines had no way of managing this by themselves)
With little configuration, we can limit the number of tasks being executed concurrently, either by type of task or by who is requesting them

Thus, Resque to the rescue! 😉

So, what’s Resque?

Resque (pronounced like “rescue”) is a Redis-backed library for creating background jobs, placing those jobs on multiple queues, and processing them later.

It was developed by GitHub back in 2009 out of necessity.

GitHub needed to perform massive amounts of background work. Before building their own solution, they tried several existing background job systems and concluded that each of them had some type of limitation (performance, latency, memory) that ruled it out, so they chose to create their own.

Initially developed in Ruby, it soon became popular within the community, which naturally led to ports in other languages such as Java and JavaScript.

If you are interested in Resque’s beginnings and want to know more, check this blog post.

In our project, we used node-resque, a Node.js implementation of Resque.

Concepts

Now that you have some context about Resque, let us introduce you to some basic concepts you should know before using it. We can decompose the entire system into 4 self-explanatory pieces:

Queue - responsible for enqueueing/dequeueing our jobs

Worker - responsible for running our jobs. One per Node.js process by default

Scheduler - responsible for promoting enqueued delayed jobs into the work queues when it is time

Job - the actual task we want to run

Setup

To start using Resque, the first thing you need to do is to set up a connection to your Redis database:
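As a minimal sketch, the connection settings might look like the object below. It assumes a local Redis reached through the `ioredis` client; the host, port, database, and namespace values are placeholders to adapt to your environment.

```javascript
// Connection settings shared by the queue, the workers and the scheduler.
// All values are placeholders for a local Redis instance (assumption).
const connectionDetails = {
  pkg: "ioredis",      // Redis client package node-resque should use
  host: "127.0.0.1",
  password: null,
  port: 6379,
  database: 0,
  namespace: "resque", // prefix for the keys node-resque writes to Redis
};
```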

Then you can define your jobs that will be executed by your workers:
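A job definition is an object keyed by job name, where each job exposes a `perform` function. The sketch below uses a hypothetical `exportCatalog` job, and `buildExport` is only a stand-in for the real export logic.

```javascript
// Hypothetical stand-in for the real work (queries, file generation, email).
async function buildExport(catalogId, email) {
  return `export of catalog ${catalogId} sent to ${email}`;
}

// Job definitions are shared by the queue and the workers.
// The property name ("exportCatalog") is the job name used when enqueueing.
const jobs = {
  exportCatalog: {
    // Arguments arrive in the same order they were enqueued.
    perform: async (catalogId, email) => {
      return buildExport(catalogId, email);
    },
  },
};
```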

There are some plugins available you can use out of the box. One of them is QueueLock, which blocks enqueueing a job if one with the same name and arguments is already present in your queue. Another one is JobLock, whose responsibility is to prevent multiple jobs with the same identifier from running at the same time (by re-enqueueing them to be executed sequentially).

By default, these plugins build the key that uniquely identifies a job from the job name, queue, and args, but you can also specify your own key based on your needs. In our configuration, we use the first and second args (catalog ID and email) to create the job identifier.
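A custom key along these lines could be sketched as follows. It assumes that each plugin calls a `key` function with the plugin instance as `this`, so the job arguments are available as `this.args`. The `lockKey` helper, the key prefixes, and the `exportCatalog` name are hypothetical.

```javascript
// Builds a key function for a lock plugin.
// Assumption: the plugin invokes it with itself as `this`, exposing the job
// arguments as `this.args`, hence a regular function instead of an arrow function.
// The prefix keeps the QueueLock and JobLock keys distinct, so one plugin
// never mistakes the other's lock for its own.
function lockKey(prefix) {
  return function () {
    const [catalogId, email] = this.args; // first and second job arguments
    return `${prefix}:exportCatalog:${catalogId}:${email}`;
  };
}

// Options to place in the job definition, next to its plugin list and perform().
const pluginOptions = {
  QueueLock: { key: lockKey("queue-lock") },
  JobLock: { key: lockKey("job-lock"), reEnqueue: true },
};
```

With keys like these, two exports for the same catalog and email share a lock even when any remaining arguments differ.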

Once jobs are defined, start your queue, worker and scheduler:
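The startup could be sketched as below. The `startResque` name and the `exports` queue are hypothetical, and the third parameter exists only so the function can be exercised with stand-ins; by default it loads the real library.

```javascript
// Connects a queue, a worker and a scheduler that share the same Redis settings.
// `resque` defaults to the node-resque module (loaded only when not supplied).
async function startResque(connectionDetails, jobs, resque = require("node-resque")) {
  const { Queue, Worker, Scheduler } = resque;

  // The queue is the handle used to enqueue jobs.
  const queue = new Queue({ connection: connectionDetails }, jobs);
  await queue.connect();

  // The worker polls the listed queues and runs the matching job's perform().
  const worker = new Worker(
    { connection: connectionDetails, queues: ["exports"] },
    jobs
  );
  await worker.connect();
  worker.start();

  // The scheduler promotes delayed jobs into the work queues when they are due.
  const scheduler = new Scheduler({ connection: connectionDetails });
  await scheduler.connect();
  scheduler.start();

  return { queue, worker, scheduler };
}
```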

The next step should be to ensure a graceful shutdown:
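A minimal sketch of such a helper, assuming the `queue`, `worker`, and `scheduler` objects created at startup; the function names and the choice of signals are assumptions.

```javascript
// Stops polling and disconnects each component in turn.
async function shutdown(queue, worker, scheduler) {
  await scheduler.end();
  await worker.end();
  await queue.end();
}

// Runs the shutdown once when the process is asked to stop.
function shutdownOnSignal(queue, worker, scheduler) {
  for (const signal of ["SIGINT", "SIGTERM"]) {
    process.once(signal, async () => {
      await shutdown(queue, worker, scheduler);
      process.exit(0);
    });
  }
}
```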

This helper will properly clear your worker status from Resque when shutting down your application.

You can also subscribe to worker and scheduler events.
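For instance, a small logging helper might look like the sketch below. The `logResqueEvents` name and the message texts are hypothetical, and only a handful of events are shown.

```javascript
// Attaches log handlers to a worker and a scheduler.
// `log` defaults to console.log and can be replaced by any logger function.
function logResqueEvents(worker, scheduler, log = console.log) {
  worker.on("start", () => log("worker started"));
  worker.on("job", (queue, job) => log(`working ${job.class} from ${queue}`));
  worker.on("success", (queue, job, result, duration) =>
    log(`finished ${job.class} in ${duration}ms`)
  );
  worker.on("failure", (queue, job, failure) =>
    log(`failed ${job.class}: ${failure}`)
  );
  worker.on("error", (error) => log(`worker error: ${error}`));

  scheduler.on("start", () => log("scheduler started"));
  scheduler.on("transferredJob", (timestamp, job) =>
    log(`delayed job ${job.class} moved to its queue`)
  );
  scheduler.on("error", (error) => log(`scheduler error: ${error}`));
}
```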

Your initial setup is complete! You can start enqueueing your tasks:
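Enqueueing comes down to a single call on a connected queue. In this sketch, the `exports` queue, the `exportCatalog` job, and the `requestExport` wrapper are hypothetical names.

```javascript
// Enqueues an export on the "exports" queue. The arguments array is handed
// to the job's perform() in the same order.
async function requestExport(queue, catalogId, email) {
  return queue.enqueue("exports", "exportCatalog", [catalogId, email]);
}
```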

Or, if you want to schedule a job to run in the future:
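Delayed jobs can be enqueued either relative to now or for a fixed moment, and they rely on a running scheduler to be promoted when due. The wrapper names, the `exports` queue, and the ten-minute default below are assumptions.

```javascript
const TEN_MINUTES_MS = 10 * 60 * 1000;

// Runs the export after a delay, given in milliseconds from now.
async function requestExportLater(queue, catalogId, email, delayMs = TEN_MINUTES_MS) {
  return queue.enqueueIn(delayMs, "exports", "exportCatalog", [catalogId, email]);
}

// Runs the export at a fixed moment, passed as a Date and converted
// to a millisecond timestamp.
async function requestExportAt(queue, catalogId, email, date) {
  return queue.enqueueAt(date.getTime(), "exports", "exportCatalog", [catalogId, email]);
}
```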

At this point, you should be able to start running your background tasks.

Since Resque is backed by Redis, distributing your background tasks takes no extra effort. Make sure to apply the same setup to all your hosts and to point them at the same Redis instance.

In Conclusion

Resque, in conjunction with the JobLock and QueueLock built-in plugins, made our life easier. QueueLock will avoid duplicate jobs in the queue so managers cannot request the same export multiple times. JobLock will make sure that only one export job executes for the same manager.

Besides, it has the nifty perk of letting us scale the system without worrying about these tasks consuming too many resources.

We hope you enjoyed this glance at the setup and execution of Resque in Node.js. We showed how Resque, combined with its built-in plugins, allowed us to solve our problems with little effort, leaving us more time to focus on the task itself.

In the next article, we’ll focus on how smart database usage lets us keep queries performant and our system responsive.

Thanks for reading! If you enjoyed our content, you can find more over at our Twitter, Facebook and LinkedIn 👋