Full Source Code: https://github.com/MoonVision/django-dask-demo

Overview

Web servers need asynchronous task execution for longer-running computations. A common solution is a task queue such as Celery. As storing large amounts of data has become easier in recent years, data scientists need more and more computing power to process it, and Dask was introduced to help them. From the docs: “Dask is a flexible library for parallel computing in Python.” Let’s look at how we can use Dask behind our Django REST Framework web server to process asynchronous tasks.

Dask Setup

Dask can run on a single machine or on a distributed system with thousands of cores. In a cluster, a scheduler coordinates the worker nodes that execute your tasks. This setup lets you supply a preload script that runs before the worker starts; we will use this script to set up the Django environment on our workers. At the time of writing, you install Dask with the “dask” pip package and its distributed components with the “distributed” pip package. See the official docs for current info. …
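To make the preload idea concrete, here is a minimal sketch of what such a script might look like. It is not necessarily the script from the linked repository: the file name worker_setup.py and the settings module "myproject.settings" are placeholders. It relies on Dask's preload hook, where a module passed to the worker's --preload option may define a dask_setup function that Dask calls with the worker object during startup.

```python
# worker_setup.py (hypothetical file name), passed to the Dask worker
# via its --preload option.
import os

# "myproject.settings" is a placeholder; point this at your own settings module.
# setdefault keeps any value already present in the environment.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")


def dask_setup(worker):
    """Hook that Dask calls with the worker object when the worker starts up."""
    # Imported here so the module itself can be loaded without Django installed.
    import django

    # Load settings and populate the app registry so that tasks running on
    # this worker can import and use Django models.
    django.setup()
```

With this in place, any task that runs on the worker should be able to import models and query the database as it would inside the web server process, provided the worker machine has the project code and its dependencies installed.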

Matt Nicolls
