Background Jobs in Elixir & Phoenix

I’m working on some side projects in Elixir & Phoenix, and need to run some work in a background process, for all the normal reasons (running asynchronously from the web request, retries, and scheduling).

This is a summary of what I was looking for, and what I found from my searching.

What features am I looking for?

Asynchronous

This is the whole reason I want a job system in the first place. The goal is that I ask for a job to be done, then I trust that it will get done soon. The code that enqueues a job is not expecting to wait on the result of the background work before returning.

Durability

This is simply “Do I trust that this job won’t be lost?” Each individual type of job will have its own requirements for durability. A cache update may be entirely optional, while a 3rd party API call to enable an account may be vital to what I’m doing.

  • If your server crashes, does the job get lost?
  • If your job hits a bug, does the job get lost?
  • If the external API returns an error, does the job get lost?

Retry on Failure

This is the other half of durability. Assuming you don’t lose requested jobs on failure, how can you ensure they retry?

Ideally libraries will have some flexibility around exactly how retries happen, maybe with backoff strategies, max retry counts, shifting the jobs to secondary (retry) queues, and so on.
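To make "backoff strategies" and "max retry counts" concrete, here is a minimal sketch of exponential backoff with a retry limit. The module name RetrySketch, the 100 ms base delay, and the option names are all assumptions made up for illustration; none of the libraries below necessarily work this way.

```elixir
defmodule RetrySketch do
  # Hypothetical helper: none of these names come from a real library.

  # Delay before retrying after the given attempt (1-based):
  # base, 2 x base, 4 x base, ...
  def backoff_ms(attempt, base_ms \\ 100) when attempt >= 1 do
    base_ms * Integer.pow(2, attempt - 1)
  end

  # Calls `fun` until it returns {:ok, value} or max_attempts is used up.
  def run(fun, opts \\ []) do
    max_attempts = Keyword.get(opts, :max_attempts, 3)
    base_ms = Keyword.get(opts, :base_ms, 100)
    attempt(fun, 1, max_attempts, base_ms)
  end

  defp attempt(fun, n, max, base_ms) do
    case fun.() do
      {:ok, value} ->
        {:ok, value}

      {:error, reason} when n >= max ->
        # Out of attempts: report the last failure instead of retrying.
        {:error, {:gave_up, reason}}

      {:error, _reason} ->
        Process.sleep(backoff_ms(n, base_ms))
        attempt(fun, n + 1, max, base_ms)
    end
  end
end
```

Note that this blocks the calling process while it sleeps, and the retry state lives only in memory, so it covers the "retry" half but not the durability half.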

Limiting

I’m cheap and only buy small VPSs and small database servers for my experimental projects. To avoid overwhelming them, I want to be able to limit how many jobs run in parallel.
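As a rough idea of what limiting looks like at the lowest level, Elixir's built-in Task.async_stream takes a max_concurrency option. This is a minimal sketch; the LimitSketch module and its run_all function are hypothetical names, and a real job library would apply the limit per queue rather than per list of work.

```elixir
defmodule LimitSketch do
  # Hypothetical helper: runs `fun` on every item, never more than
  # `max` at a time, and returns the results in the original order.
  def run_all(items, fun, max \\ 2) do
    items
    |> Task.async_stream(fun, max_concurrency: max, timeout: 5_000)
    |> Enum.map(fn {:ok, value} -> value end)
  end
end
```

For example, LimitSketch.run_all([1, 2, 3, 4, 5], fn n -> n * 10 end, 2) processes the five items with at most two running at once.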

Queues

Different work may have different requirements, or priorities, or even specific servers they need to run on.

Example: A video analysis job that requires 3rd party software licensed per-system, and you only have one license. You’d like any server to be able to enqueue a new analysis job, but only that one server to work on them. Meanwhile, any system should be able to send emails and run other more flexible jobs.

Reporting & Monitoring

Simple stuff like:

  • How many failures
  • How many successes
  • Depth of the queues
  • Time between enqueueing & when the job starts

What is out of scope?

General queuing tools

Tools like RabbitMQ, Redis and AWS SNS are all amazing, but they’re not designed to run background jobs.

Often, they will be building blocks of the systems that are more specialized background workers.

The jobs themselves

There are dozens of interesting libraries & APIs to send email, or resize images, or whatever task you really want to run in your job. Here, we’re focused on how we start and track the work, not the work itself.

Working on an Elixir Project? We’re building Elixir App Monitoring at my day job which will keep an eye on your Phoenix App. Signup for early access.

Approaches & Libraries

Processes

All of Elixir & Erlang’s power is based on the idea of processes all running side by side, isolated from each other. A bare process, started with spawn, is the underlying building block of most of the tools below. But it is so basic that you’d have to build an entire system on top of it to get retries and persistence.
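To show just how bare a bare process is, here is a minimal sketch that runs a function with spawn and sends the result back to the caller. The SpawnSketch module and its function names are hypothetical; only spawn, send, receive, and make_ref are built in.

```elixir
defmodule SpawnSketch do
  # Hypothetical helper: starts `fun` in a new, unlinked process and
  # returns a reference the caller can use to pick up the result.
  def run_in_background(fun) do
    parent = self()
    ref = make_ref()
    spawn(fn -> send(parent, {ref, fun.()}) end)
    ref
  end

  # Waits for the result message. If the spawned process crashed, no
  # message ever arrives, so all the caller sees is a timeout.
  def await(ref, timeout_ms \\ 1_000) do
    receive do
      {^ref, result} -> {:ok, result}
    after
      timeout_ms -> {:error, :timeout}
    end
  end
end
```

Everything beyond this (noticing crashes, retrying, surviving a restart) would have to be written by hand.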

Tasks

Elixir ships with a relatively simple Task library. It is a bare-bones way to disconnect a bit of work from the currently running code.

# Add to the children list in your application.ex's supervisor tree
# (the older supervisor/2 helper from Supervisor.Spec is deprecated):
{Task.Supervisor, name: MyApp.TaskSupervisor}
# Then create tasks either with an anonymous function,
# or a module & function. The restart option is given per task.
Task.Supervisor.start_child(MyApp.TaskSupervisor,
  fn -> IO.puts("Long running task") end, restart: :transient)
# Asynchronous version of calling:
# MyApp.Tasks.LongRunning.run(1, 2, 3)
Task.Supervisor.start_child(MyApp.TaskSupervisor,
  MyApp.Tasks.LongRunning, :run, [1, 2, 3], restart: :transient)
Apart from the supervisor immediately restarting a crashed task (the :transient setting), this code provides no retry scheduling, backoff, or any other job features. If you use the start_child version (unlinked from the calling process), you won’t even be notified if & when the process completes or fails.

Of course, you could build a more robust system on top of the Task infrastructure, but that’s not the goal.
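If the caller does want to hear back, Task.Supervisor.async_nolink combined with Task.yield is one option. This is a minimal sketch under that assumption; the TaskSketch module and its run function are hypothetical wrappers, while the Task and Task.Supervisor calls are the standard library's.

```elixir
defmodule TaskSketch do
  # Hypothetical wrapper. Runs `fun` under the given Task.Supervisor
  # without linking it to the caller, then waits up to timeout_ms.
  def run(supervisor, fun, timeout_ms \\ 1_000) do
    task = Task.Supervisor.async_nolink(supervisor, fun)

    # Task.yield returns nil on timeout; Task.shutdown then stops the
    # task and returns a reply only if one arrived in the meantime.
    case Task.yield(task, timeout_ms) || Task.shutdown(task) do
      {:ok, value} -> {:ok, value}
      {:exit, reason} -> {:error, reason}
      nil -> {:error, :timeout}
    end
  end
end
```

The caller now learns about success, failure, and timeouts, but it is waiting on the result again, which gives up the asynchronous part I wanted in the first place.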

GenServer

A GenServer is the more general, long-running counterpart to a Task. I’m not going to spend much time here, since the tradeoffs end up being very similar. A GenServer, launched via a Supervisor, has the tools necessary to build a robust background work queue, but does not in itself provide that functionality.
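For a sense of the starting point, here is a minimal sketch of a GenServer used as a work queue. The QueueSketch module and its functions are hypothetical; it leans on the process mailbox as the queue and runs one job at a time.

```elixir
defmodule QueueSketch do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Returns immediately; the job runs later inside the server process.
  def enqueue(server, fun) when is_function(fun, 0) do
    GenServer.cast(server, {:enqueue, fun})
  end

  # Number of jobs finished so far (a stand-in for real reporting).
  def completed(server), do: GenServer.call(server, :completed)

  @impl true
  def init(:ok), do: {:ok, %{completed: 0}}

  @impl true
  def handle_cast({:enqueue, fun}, state) do
    # Jobs run one at a time, in the order their messages arrived.
    fun.()
    {:noreply, %{state | completed: state.completed + 1}}
  end

  @impl true
  def handle_call(:completed, _from, state) do
    {:reply, state.completed, state}
  end
end
```

If a job raises, the server crashes and every job still waiting in its mailbox is lost, which is exactly the durability and retry work that would have to be built on top.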

Exq

Exq is a library that claims to be compatible with Ruby’s Sidekiq library. It stores jobs outside your application, serialized as JSON into a Redis database. Then it has a watcher of those Redis queues, popping work off, and working on it (in parallel of course).

The benefit of being compatible with Sidekiq’s format is that you can migrate a Rails application slowly, moving jobs over to an Elixir backend.

Website: https://github.com/akira/exq
ElixirCasts Guide: https://elixircasts.io/elixir-job-processing-with-exq

Toniq

Similar in many ways to Exq, but uses native Erlang serialization instead of serializing JSON into & out of Redis. So Toniq is not compatible with any existing Sidekiq or Resque jobs you may have from existing Rails projects. Of course, that doesn’t matter for new systems.
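To illustrate what native Erlang serialization means in practice, here is a minimal sketch using the built-in :erlang.term_to_binary and :erlang.binary_to_term. The SerializationSketch module is hypothetical and this is not Toniq's actual code; it only shows the general technique.

```elixir
defmodule SerializationSketch do
  # Hypothetical helpers wrapping the built-in Erlang term format.
  def dump(job), do: :erlang.term_to_binary(job)

  # Only decode binaries you wrote yourself: binary_to_term on
  # untrusted input is unsafe.
  def load(binary), do: :erlang.binary_to_term(binary)
end
```

A job such as {MyApp.Tasks.LongRunning, :run, [1, 2, 3]} comes back from load exactly as it went into dump, tuple and atoms included. JSON has no tuples or atoms, which is why a JSON-based format needs its own conventions for naming the worker and its arguments.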

One interesting difference from Exq is that instead of using Redis as the path that all jobs pass through, it only uses Redis as a backup mechanism, and keeps the job definitions in the running VM. Redis can be used to repopulate the VM’s work queues when needed (after a restart for instance). But because the work is stored locally in the VM, it isn’t automatically load-balanced to all VMs also connected to that Redis the way that Exq or Verk would.

Website: https://github.com/joakimk/toniq

Verk

Verk, like Exq, uses Redis as its backend, with a Sidekiq compatible format.

It uses some fancy Lua code pushed to Redis that allows for a more robust “never lose a job” behavior, and Verk’s README emphasizes the separation of each queue into its own Supervisor, for additional levels of reliability.

Website: https://github.com/edgurgel/verk

My Choice

After looking at the tools, I think Exq is the first one I’m going to investigate. The differences between the three libraries here are pretty small, so this is a mild preference, but Exq offering its web interface is enough to make me try it first. I’ll update this post as I implement.
