Prefect + Dask

or: How I Learned to Stop Worrying and Love Distributed Computing

Christopher White
Published in The Prefect Blog
Apr 15, 2019


Since Prefect was open-sourced, one of the most common questions has been:

How do I deploy a Prefect flow?

The Prefect engine has simple hooks for configuring whatever deployment model you prefer, and we’re doing a lot of work to make flow storage, deployment, and execution in custom environments even easier. These are the exact same hooks that we use in Prefect Cloud (sign up for the preview here!), but if you want to dive in yourself, you can take advantage of them today.

To help you get started, I’m going to write a series of posts to answer your deployment-related questions, starting with:

How do I run Prefect in a distributed Dask cluster?

Prefect + Dask

We love Dask, and you should too!

Dask is a system for distributed computing that scales seamlessly from your laptop to immense clusters. Prefect was designed for Dask, and it’s the default executor in Prefect Cloud. When running on Dask, Prefect tasks can launch asynchronously, with millisecond latency, and run in parallel (up to the number of workers you have). In fact, once you’ve used Prefect with Dask, running workflows any other way will feel… prehistoric.

Launching a Dask Cluster

To get started with Dask right away, simply launch a cluster with two workers on your local machine (everything you need comes installed with Prefect):

CLI commands for spinning up a local Dask cluster
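
The original embedded snippet is not reproduced here, so what follows is only a minimal sketch of what such a launch could look like. It assumes the `dask-scheduler` and `dask-worker` commands from Dask's `distributed` package are on your PATH and that the scheduler listens on Dask's default port, 8786. The variable names and the choice to background each process are mine, for illustration; newer Dask releases also offer `dask scheduler` and `dask worker` subcommands.

```shell
# Address the workers will connect to (8786 is Dask's default scheduler port).
SCHEDULER_ADDRESS="tcp://127.0.0.1:8786"

# Start the scheduler in the background and remember its process ID.
dask-scheduler &
SCHEDULER_PID=$!

# Start two workers that register with the scheduler.
dask-worker "$SCHEDULER_ADDRESS" &
WORKER_1_PID=$!
dask-worker "$SCHEDULER_ADDRESS" &
WORKER_2_PID=$!

echo "Scheduler expected at $SCHEDULER_ADDRESS"
```

If you would rather watch each process's logs, run the three commands without the trailing `&` in three separate terminals. To shut the backgrounded cluster down, `kill "$SCHEDULER_PID" "$WORKER_1_PID" "$WORKER_2_PID"` should do it.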

Configuring Prefect

Configuring Prefect to run with Dask is as simple as setting two environment variables, one enabling the Dask executor and another pointing at the cluster:

Set the necessary environment variables to point Prefect to your Dask cluster
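
The original snippet is missing here as well, so treat the following as a hedged sketch and not the canonical configuration. The variable names follow the `PREFECT__<SECTION>__<KEY>` pattern that Prefect used for environment-based configuration at the time of writing, and the executor class path and scheduler address are assumptions you should check against the docs for your Prefect version and your own cluster.

```shell
# Assumed variable names, following Prefect's PREFECT__<SECTION>__<KEY> pattern;
# verify them against the configuration docs for your Prefect version.

# 1. Make the Dask executor the default executor for flow runs.
export PREFECT__ENGINE__EXECUTOR__DEFAULT_CLASS="prefect.engine.executors.DaskExecutor"

# 2. Point that executor at the scheduler of the running cluster
#    (here, a scheduler on this machine at Dask's default port).
export PREFECT__ENGINE__EXECUTOR__DASK__ADDRESS="tcp://127.0.0.1:8786"
```

Because these are ordinary environment variables, they only affect processes started from the same shell session, so export them before launching Python.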

Now you can fire up Prefect, and any flow you run will execute in your cluster!

Further Reading

Check out the full Dask tutorial in our docs to learn more, including:

  • code for all examples
  • how to save flows in scripts and execute them from the command line
  • how to automatically run flows on a schedule

Happy engineering!
