Hacking Celery to Write a Code Formatting API

Paul Bailey
Sep 30, 2018

Code Format API Docs: https://www.neutron64.com/help/code-format

Recently I decided the world needed a code formatting API. Code formatting is all the rage these days with tools like gofmt, prettier, and black. However, I wanted to be able to format code without having to worry about installing all the different tools. So I wrote a code format API. Can you make an HTTP request? Then you can format some code.

The API takes in code and then routes it to the appropriate tool for formatting. My backend is written in Python and I was already using Celery for a distributed task queue. I could simply execute the tool on the command line and send in the code via stdin. However, I wanted to see if I could have a tighter integration that wouldn’t need to constantly reload the tool’s code on the command line. So what if I put tasks on the Celery queue and have different languages execute the tasks? Celery really isn’t designed for this. There are some implementations for putting tasks on the queue from other languages, but there is no real protocol for executing tasks in languages other than Python.
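For comparison, the command-line approach mentioned above could be sketched roughly like this. It is a minimal sketch, not the API's actual code: `format_via_cli` is a hypothetical helper, and the command is assumed to be any tool that reads code on stdin and writes the formatted result to stdout.

```python
import subprocess
import sys


def format_via_cli(command, code, timeout=10):
    """Pipe code into a formatter's stdin and return what it writes to stdout.

    `command` is an argument list for a tool that reads stdin and writes stdout.
    Raises subprocess.CalledProcessError if the tool exits with an error.
    """
    completed = subprocess.run(
        command,
        input=code,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in "formatter" so the sketch runs without any tool installed:
    # a Python one-liner that strips trailing whitespace from its input.
    stand_in = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read().rstrip() + '\\n')",
    ]
    print(format_via_cli(stand_in, "x = 1   \n\n\n"), end="")
```

Every call starts a fresh process, which is the reload cost that a long-running worker on a queue avoids.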

Separate Queues For Different Languages

The first problem was making sure each language picked up only tasks that it could handle. The easiest way to do this was to make each language worker work on a separate queue. So when adding tasks I made sure to route them to the correct queue like so:

# python code
async_result = celery_app.send_task(
    'prettier_format',
    args=[code, parser, options],
    queue='prettier'
)
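If more formatters are added later, the routing could live in one small table. The sketch below is an assumption about how that might look, not the API's real code: only the `prettier_format` task and the `prettier` queue come from the example above, while the `ROUTES` table, its language keys, `route_for`, and `enqueue` are hypothetical.

```python
# Hypothetical routing table: language -> task name and queue.
# Only the prettier task and queue names come from the article.
ROUTES = {
    "javascript": {"task": "prettier_format", "queue": "prettier"},
    "css": {"task": "prettier_format", "queue": "prettier"},
}


def route_for(language):
    """Return the task name and queue for a language, or raise ValueError."""
    try:
        return ROUTES[language.lower()]
    except KeyError:
        raise ValueError("no formatter configured for %r" % language)


def enqueue(celery_app, language, code, parser, options):
    """Send the task to the queue that the matching worker listens on."""
    route = route_for(language)
    return celery_app.send_task(
        route["task"],
        args=[code, parser, options],
        queue=route["queue"],
    )
```

Rejecting unknown languages before calling `send_task` avoids putting work on a queue that no worker is reading.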

By default Celery serializes tasks into JSON. Since JSON is pretty universal now, this makes it easy to pick up the task and run it in any language. Also by default Celery uses RabbitMQ as its broker. Since I wanted easy access to the queue from many languages, I decided to go with Redis instead. While Redis is not as featureful as a Celery broker, it is good enough and it has good clients written in many more languages.

Javascript Celery Worker

Javascript was the first worker I wanted to implement since I wanted to use prettier. Prettier can format a wide variety of languages, so I knew it would give me a good feature set without having to integrate many different tools.

I needed to solve two problems:

  1. get a task off the queue
  2. return the results

To get a task off a Redis queue you use the pop command: redis_client.blpop('prettier', 0, process_prettier). That line does a blocking pop and executes the callback process_prettier on the data. When processing that queue’s data we need to de-serialize it from a string.

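// JS Code
// First parse: the message envelope that Celery put on the queue
var data = JSON.parse(data_string);
// Second parse: the body is base64-encoded JSON; its first element is the args list
var args = JSON.parse(atob(data.body))[0];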

Notice you have to de-serialize twice: once to get the job data, then a second time since the arguments to the job are also serialized. Also notice the arguments are base64 encoded. This is done in case you are using binary data in one of your arguments.
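To make the two layers concrete, here is the same decoding in Python against a hand-built sample message. The sample is a simplified assumption about what sits on the Redis list (a real message carries more headers and properties), the task id `1234` is made up, and `decode_task` is a hypothetical helper.

```python
import base64
import json


def decode_task(data_string):
    """Return (task_id, args) from a JSON envelope with a base64 JSON body."""
    # First layer: the envelope itself is JSON
    data = json.loads(data_string)
    # Second layer: the body is base64-encoded JSON
    body = json.loads(base64.b64decode(data["body"]).decode("utf-8"))
    # The first element of the body holds the positional arguments
    return data["headers"]["id"], body[0]


# Hand-built, simplified sample message (not captured from a real queue)
sample_body = json.dumps([["let x=1", "babel", {}], {}, {}])
sample = json.dumps({
    "body": base64.b64encode(sample_body.encode("utf-8")).decode("ascii"),
    "headers": {"id": "1234", "task": "prettier_format"},
})

task_id, args = decode_task(sample)
print(task_id, args)
```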

You now have your input and can do the processing. When the processing is complete, you’ll want to store your results. To do this you need to structure the data in a way Celery knows how to process, then both set and publish the results. See the code below.

// JS Code
var id = data.headers.id;
var key = `celery-task-meta-${id}`;
var result = {
  status: "SUCCESS",
  result: result_data,
  traceback: null,
  children: [],
  task_id: id
};
result = JSON.stringify(result);
// Store the result for 60 seconds, then notify anyone waiting on it
redis_results_client.set(key, result, 'EX', 60);
redis_results_client.publish(key, result);

After you set and publish the results, your Python code will pick up the results with an async_result.get().
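The result format can also be written down in Python, which is handy for checking what a worker in another language should emit. This is a sketch under assumptions: `build_result` and `store_result` are hypothetical helpers, the field set mirrors the JS snippet above rather than everything Celery may store, and `redis_client` is assumed to be a redis-py style client.

```python
import json


def build_result(task_id, result_data):
    """Return the Redis key and JSON payload for a successful task."""
    key = "celery-task-meta-%s" % task_id
    payload = json.dumps({
        "status": "SUCCESS",
        "result": result_data,
        "traceback": None,
        "children": [],
        "task_id": task_id,
    })
    return key, payload


def store_result(redis_client, task_id, result_data, ttl_seconds=60):
    """Set the result with an expiry, then publish it on the same key."""
    key, payload = build_result(task_id, result_data)
    redis_client.set(key, payload, ex=ttl_seconds)  # redis-py style call
    redis_client.publish(key, payload)
    return key
```

Keeping the payload builder separate from the Redis calls makes it easy to compare its output with what the other-language worker writes.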

Conclusions

You can see it only takes a few lines of code to hack together a working Celery worker in another language. Most of the hard work in doing this is exploring how Celery implements its queue protocol. By following the code above you can now skip some of that investigation.

The downside to this hack is that it doesn’t support all the features of Celery, like retrying and proper error reporting. However, you can get a stable worker that can handle lots of work and can be scaled.

I would like to formalize a more official Celery Javascript worker in the future. If this is something you think you would use, leave me a comment and maybe we can collaborate.

Also let me know if you would ever use a code formatting API: https://www.neutron64.com/help/code-format
