A Scheduler for the Internet

Jeremiah Lowin
Published in The Prefect Blog
Oct 7, 2019

Today, we’re excited to announce Prefect Scheduler, a free workflow management system for individuals that supports any execution environment. Scheduler is the product we’ve always wanted to build, combining Prefect Core’s ease of use with Prefect Cloud’s production-ready infrastructure. With Scheduler, users just call flow.deploy() and workflows are immediately prepared for execution on any machine, on any schedule, at any time.

You can sign up for Scheduler here.

As a company, Prefect is dedicated to eliminating negative engineering. Our open-source engine, Prefect Core, is powering workflows from data science bootcamps to the largest companies in the world. Our orchestration console, Prefect Cloud, is delivering confidence in every industry from e-commerce to professional sports. And now, Scheduler will give all of our users the easiest possible path to deployment.

Why did we create Prefect Scheduler?

Our goal with Scheduler is to provide any user of our software with a turnkey workflow orchestration system. If you’ve written a flow in Prefect Core, Scheduler lets you benefit from Prefect Cloud’s most popular features — including its UI, API, and, of course, its scheduler — with a single line of code, for free.

Scheduler is the purest expression of the difference between what we make and what we sell. We make (in fact, we open-sourced!) the best engine for building and running workflows. And we sell an automation platform that delivers workflow confidence. Now that Cloud is humming along, we can borrow its infrastructure to deliver an enhanced experience to our open-source users.

Running your custom code on a schedule is really, really hard. We’re unaware of a good solution that doesn’t involve handing code over to a third party or setting up and hosting an entire platform yourself. Since Prefect Scheduler is a lightweight way to run code in any private environment, either on a schedule or with a simple API call, we hope to finally offer a simple solution to a complex problem.

If you need to schedule complex workflow logic, or even a single script, you should use Scheduler. Compared to using Prefect Core alone, you’ll gain asynchronous, distributed execution and full visibility into the health of your system, all from one line of code. Thanks to its simplicity and universal access, we hope it becomes the scheduler for the internet.

How does Prefect Scheduler work?

To reiterate: running code on a schedule is deceptively hard… but it seems like it should be easy. As we’ve written, “This should be easy” is a telltale sign of negative engineering. It’s no surprise, therefore, that legacy workflow software has evolved a convoluted solution to this problem.

Most workflow systems take one of two approaches to scheduling: they either outsource the work to an inflexible piece of software like cron or Windows Task Scheduler, introducing a brittle single point of failure; or they expect users to spin up and maintain a massive stateful infrastructure, consisting of databases, webservers, schedulers, and workers.

Like most modern data scientists and engineers, we reject both of these approaches as archaic. Why should you need a dedicated database just to run an analysis on the last business day of the quarter?

At Prefect, we’ve come up with a better way: our Hybrid Platform, which is fully supported by Prefect Scheduler. With our hybrid execution model, Prefect manages the orchestration infrastructure — the database, the UI, the API — while users keep their code and data completely private. When it’s time to run your workflow, a small open-source Agent deploys it on whatever laptop, cluster, or platform you want, coordinating all resulting state updates with Prefect Cloud to ensure workflow semantics are respected. This system works no matter how many concurrent, distributed agents are running and lets you keep full control of your workflows. Only small amounts of metadata are sent to our servers, and users can opt to further constrain communication if necessary. Essentially, the only thing you need to provide is an execution environment, which can be as simple or complex as your workflows require.
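To make the division of labor concrete, here is a toy, in-memory sketch of the hybrid idea. It is not Prefect's implementation or API: the `Orchestrator` and `Agent` classes, their methods, and the state strings are all hypothetical names invented for illustration. The point it tries to show is that the hosted side only ever sees flow names and states, while the code and its results stay in the user's environment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Orchestrator:
    """Hypothetical stand-in for the hosted side: stores run metadata only."""
    pending: List[str] = field(default_factory=list)
    states: Dict[str, str] = field(default_factory=dict)

    def schedule(self, flow_name: str) -> None:
        # Only a name and a state are recorded; no code or data is uploaded.
        self.pending.append(flow_name)
        self.states[flow_name] = "Scheduled"

    def claim_next(self) -> Optional[str]:
        # Hand the oldest pending run to whichever agent asks first.
        return self.pending.pop(0) if self.pending else None

    def report(self, flow_name: str, state: str) -> None:
        self.states[flow_name] = state


@dataclass
class Agent:
    """Hypothetical stand-in for an agent running in the user's environment."""
    orchestrator: Orchestrator
    local_flows: Dict[str, Callable[[], object]]  # the code never leaves this process

    def poll_once(self) -> None:
        flow_name = self.orchestrator.claim_next()
        if flow_name is None:
            return  # nothing is due
        self.orchestrator.report(flow_name, "Running")
        try:
            self.local_flows[flow_name]()  # the return value stays local
            self.orchestrator.report(flow_name, "Success")
        except Exception:
            # Includes the case where this agent does not know the flow.
            self.orchestrator.report(flow_name, "Failed")
```

In this sketch, calling `orchestrator.schedule("etl")` and then `agent.poll_once()` runs the local `etl` callable and leaves only the final state string on the orchestrator side. A real agent would poll over a network API and handle retries and concurrency, which this toy deliberately omits.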

We originally developed the Hybrid Engine to meet the privacy and security requirements of some large hedge funds. However, it works so well that once our other Cloud partners learned about our version of “on-prem lite,” every single one chose it over the fully managed service we were planning on offering. The system is so innovative that over the last year Prefect has filed multiple patents to cover its functionality, and we will not be offering a fully managed service at all. Frankly, who would choose it?

Today, the hybrid model supports four modes of execution:

  • any server with Docker installed (we’ve run this on a Raspberry Pi!)
  • dedicated Dask clusters
  • Kubernetes clusters that automatically launch per-run Dask clusters
  • AWS Fargate

We’ve been sneaking this functionality into Prefect Core for months and now that it’s breaking cover, we hope users will help us expand execution to any imaginable environment.

A system this powerful might seem difficult to use, but all you have to do is run prefect agent start. We’ll handle the rest. Specific deploys to AWS or GKE are just as easy: prefect agent start fargate and prefect agent start kubernetes, respectively.

In fact, when you first log in to Scheduler, you’ll go through a tutorial that shows you how to connect your laptop to the Prefect hybrid platform. You’ll click a button on our UI and watch a flow execute right in your local terminal. The whole thing takes about 60 seconds, start to finish.

What can Prefect Scheduler do?

Any flow you write with Prefect Core can be deployed to Prefect Scheduler. If you assigned a schedule to the flow, it will automatically execute on time; or you can always run it off-schedule via the UI, CLI, or API.

Naturally, Prefect Scheduler fully supports our new custom schedules. Trading days; month ends; next business day after the quarter; alternate side parking: if you can code it, Prefect can automate it. You can even customize how daylight saving time is handled. And yes, Prefect can parse cron strings (if you really miss the ’80s).
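As an illustration of the kind of calendar rule that cron cannot express directly, here is a small standard-library sketch that finds the last business day of a quarter. It does not use Prefect's schedule API; the function names `last_business_day_of_quarter` and `is_scheduled_today` are hypothetical, and "business day" is assumed to mean Monday through Friday with no holiday calendar.

```python
from datetime import date, timedelta


def last_business_day_of_quarter(day: date) -> date:
    """Return the last weekday (Mon-Fri) of the quarter containing `day`.

    Assumption: holidays are ignored; only weekends are skipped.
    """
    # Quarters end in month 3, 6, 9 or 12.
    quarter_end_month = 3 * ((day.month - 1) // 3 + 1)
    # First day of the month after the quarter ends, then step back one day.
    if quarter_end_month == 12:
        first_of_next = date(day.year + 1, 1, 1)
    else:
        first_of_next = date(day.year, quarter_end_month + 1, 1)
    candidate = first_of_next - timedelta(days=1)
    # weekday() is 5 for Saturday and 6 for Sunday; walk back to Friday.
    while candidate.weekday() >= 5:
        candidate -= timedelta(days=1)
    return candidate


def is_scheduled_today(today: date) -> bool:
    """True if a 'last business day of the quarter' job should run today."""
    return today == last_business_day_of_quarter(today)
```

A rule like this is a plain function of the date, which is why it is easy to write in code and awkward to squeeze into a cron string.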

In the introductory period, Prefect Scheduler will allow any user to run 5,000 successful task runs per month — that’s more than 160 tasks per day. We only count successful runs against the limit, so users should feel free to experiment with their workflows. Thanks to the Hybrid Platform, you can run tasks of unlimited duration or concurrency. If you require more but aren’t ready for Prefect Cloud (which starts at 10,000 runs per month and scales to millions), please contact us. We also have a special program for non-profits and open-source projects.
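The arithmetic behind that limit is simple enough to check by hand, but a short sketch makes the counting rule explicit. The 5,000 figure comes from the text above; the helper names and the `"Success"` and `"Failed"` strings are illustrative assumptions, not Prefect's billing logic.

```python
MONTHLY_LIMIT = 5_000  # successful task runs per month, as stated above


def runs_per_day(monthly_limit: int, days_in_month: int) -> float:
    """Average daily budget if the monthly limit is spread evenly."""
    return monthly_limit / days_in_month


def counted_runs(final_states) -> int:
    """Count only successful runs; failed runs do not use up the limit."""
    return sum(1 for state in final_states if state == "Success")
```

Even in a 31-day month, `runs_per_day(MONTHLY_LIMIT, 31)` comes out above 160, and a batch such as `["Success", "Failed", "Success"]` would count as two runs rather than three.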

Scheduler will maintain run histories for 7 days, giving users plenty of time to diagnose issues and respond appropriately.

Many Prefect Cloud features are included with Scheduler, including secrets (a way to provide sensitive information to your flows) and versioning (automatically promote the latest version of your flows).

What will you build?

We have been so fortunate to work with amazing individuals and companies over the last year to build Prefect Core, Prefect Cloud, and now Prefect Scheduler.

We hope you find Scheduler as transformative as we have. Please get in touch or join us in Slack to share any feedback.

Happy Engineering!

- The Prefect Team
