Using Celery at HyperTrack

Arjun Attam · Published in HyperTrack · Apr 11, 2016 · 3 min read

Our goal in devops at HyperTrack is to shorten the request-response lifecycle of API calls and make them super fast. To achieve this, our architecture offloads some tasks out of the synchronous request lifecycle of the application. We accomplish this with Celery, which has become a crucial component of our tech stack at HyperTrack for executing asynchronous tasks.

The need for Celery

Most of our foreground API requests are create-read-update-delete (CRUD), and these are I/O bound on our database. Our application server therefore has an event-based architecture, which does not block while waiting for an I/O response. In contrast, some of our background tasks, like the ETA models or location filtering, are CPU intensive; running them on the application server blocks the event loop and lengthens the request lifecycle for the end user.

To tackle this challenge, we set up a task queue to offload some of these jobs from the request lifecycle and move them to the background. We chose Celery for this purpose. Celery spawns worker processes to consume and execute background tasks. These workers communicate with the application server via a broker, like RabbitMQ or Redis. We found these best practices useful while setting up Celery at HyperTrack.

Splitting tasks into queues

By default, Celery routes all tasks through a single common queue on the broker. While this is easy to set up, it treats all tasks the same, which can lead to suboptimal performance. At HyperTrack, our background tasks can be segregated.

  1. The first set of tasks cleans and filters location data. These tasks involve a lot of math and are therefore CPU intensive. A process-based architecture is best suited for them.
  2. The second set of tasks defines models and fetches data for the ETAs. The performance of these tasks is determined principally by the time spent on database queries and external APIs. These I/O-bound tasks perform best with an event-based architecture.
  3. The third set consists of periodic maintenance tasks, which are also I/O bound on our database.

These categories of tasks help us define our Celery queues.
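A routing configuration along these lines can be expressed with Celery's `task_routes` setting. The task and queue names below are hypothetical, chosen only to mirror the three categories above:

```python
# Hypothetical per-category routing; task and queue names are
# placeholders, not HyperTrack's actual identifiers.
task_routes = {
    'locations.filter': {'queue': 'cpu_bound'},    # math-heavy filtering
    'eta.predict': {'queue': 'io_bound'},          # DB and external APIs
    'maintenance.cleanup': {'queue': 'periodic'},  # scheduled upkeep
}
```

This dict would be assigned to `app.conf.task_routes`, so each task lands on the queue whose workers suit its workload.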

With different Celery queues, we can use the best worker processes to execute these tasks. CPU-bound tasks are routed to a worker with a prefork process pool. I/O-bound tasks are executed by a gevent-based, event-driven worker. We use the following commands to spawn these processes.
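A sketch of the worker invocations, assuming the hypothetical app module and queue names from above:

```shell
# Prefork worker for the CPU-bound queue, with concurrency
# matched to the number of cores on the box:
celery -A myapp worker -Q cpu_bound --pool=prefork --concurrency=4

# Gevent worker for the I/O-bound and periodic queues, with much
# higher concurrency since greenlets are cheap:
celery -A myapp worker -Q io_bound,periodic --pool=gevent --concurrency=100
```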


Benchmarking worker performance

With some stress testing, we compared the performance of prefork and gevent workers, independently and in combination. Performance is measured as the time taken for a task to complete.

[Charts: task times with prefork only, with gevent only, and with the prefork + gevent combination]

In our tests, the combination of prefork and gevent workers is 4 times faster on average than either worker alone. Used independently, the prefork worker blocks consumption of new tasks while waiting for I/O-bound tasks to finish, so time spent in the queue piles up. The gevent worker shows higher execution times because CPU-bound tasks block other tasks in the execution thread.

Conclusion

Celery helps us take some load off the application server, which helps us serve your requests fast. If your application has tasks that can move outside the request lifecycle, we definitely recommend Celery. We have had our share of misadventures setting up Celery, and we will share our learnings in upcoming posts. Stay tuned.

And if your application can use location tracking, we definitely recommend checking out our APIs. Request access here.
