Prefect 0.11.0: Improved APIs (and Targets!)

Christopher White
The Prefect Blog

--

We’re excited to announce the release of Prefect 0.11.0! This release is the culmination of months of work to improve our users’ ability to build expressive and complex workflows. In particular, 0.11.0 ships with new result-handling classes, a new target-based caching API, standardized Secrets, and an improved API for working with conditional logic.

We also want to give a shoutout to all the amazing work from our contributors over the past few releases, including new features, bug fixes, great discussions in Slack, collaboration on GitHub, tutorials and working groups in our weekly meetup, and, of course, new tasks in the growing task library.

For the full list of enhancements and features, check out the release changelog.

Prefect’s mission is to provide the data community with robust building blocks for defining and deploying modern workflows. Our design goal is to maintain an intuitive path from local development with sensible defaults all the way to deeply configurable production deployments. We have always welcomed user feedback and listened carefully to any friction our users encounter. Now that Prefect is used by thousands of engineers, we can discover and address the community’s most pressing needs more quickly than ever.

In this release, we focused on improving five major areas of the user experience:

  • Expanding options for working with task results
  • Adding flexible, target-based caching
  • Standardizing our Secrets API and recommendations for third-party authentication
  • Introducing a new way to write conditional logic inside flows
  • Allowing more expressive trigger logic

See below for a deeper dive into the headline features. We look forward to seeing what new things you build!

📤 Task Results

Dataflow — the idea that tasks can pass information, metadata, and data to each other — is a cornerstone of Prefect. Therefore, perhaps the biggest change in 0.11.0 is a refactor of how Prefect works with task results, expanding users’ fine-grained control over their data.

Previously, Prefect supplied a simple read/write class called a “Result Handler” that allowed users to define where task results were stored. This approach was a success, enabling key features like Prefect’s Hybrid model, but we quickly learned that users wanted to do much more with their data: they wanted to run validations, generate artifacts and progress updates while tasks ran, configure transparent caching against various backends, and even see richer representations of it in the Prefect UI.

Prefect 0.11.0 includes a new and extremely user-friendly Result interface for interacting with task data. It already enables new caching functionality (see below), and will soon allow users to configure custom data validation on a per-flow or per-task basis. In addition, Results are no longer limited to task outputs; users can now publish intermediary results that could represent progress bars, machine learning training updates, checkpoints, or data quality reports. We are planning a feature for the Prefect UI that will allow direct rendering of certain types of structured artifacts, including tables, JSON, images, progress, and more.

This change is non-breaking, and any result handlers from prior versions of Prefect will be auto-converted into the matching Result subclass from 0.11.0. You can read more about the new Result classes, including tips for migrating to the new interface, in Prefect’s documentation.

🗃 Target-Based Caching

We’ve already listed a few of the use cases that the Result refactor enables, but the one we’ve been most excited to ship is the ability to cache tasks based on the presence of persisted data, such as a file on local disk or in cloud storage. This has been a common request from users who are accustomed to file-based caching from tools like Make or Luigi.

Tasks in 0.11.0 may now specify a string target keyword that represents a pathname whose existence should be checked before running a task. If the location exists, its data will be read into memory and set as the Cached result of that task, without redoing any computation. This means that skipping an expensive task is as simple as writing a file. Note that all Result classes support this feature, so referencing files in cloud storage like S3 or GCS is fully supported. Future implementations will include non-file-based caching, like database entries.

This new targeting system makes it far easier to reuse the output of slow or expensive tasks, whether you are rerunning a flow or changing and experimenting with its downstream tasks: if the target exists, the task skips! Prefect has always supported Cached states, but this feature should dramatically expand their utility while simultaneously reducing the learning curve.
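To make the mechanism concrete, here is a minimal sketch of the target check in plain Python. It does not use Prefect's own classes: `run_with_target` and `expensive` are hypothetical names, pickle stands in for whatever serialization a Result class uses, and the returned state labels are just strings.

```python
import os
import pickle
import tempfile


def run_with_target(compute, target):
    """Read `target` if it exists; otherwise compute the value and write it.

    Returns a (state_label, value) pair. This is an illustration of the idea,
    not Prefect's implementation.
    """
    if os.path.exists(target):
        # The target is present: load it instead of redoing the computation.
        with open(target, "rb") as f:
            return "Cached", pickle.load(f)
    value = compute()
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    with open(target, "wb") as f:
        pickle.dump(value, f)
    return "Success", value


calls = []  # records how many times the expensive function really ran


def expensive():
    calls.append(1)
    return sum(range(1_000_000))


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "etl", "expensive.pkl")
    first = run_with_target(expensive, target)   # computes and writes the file
    second = run_with_target(expensive, target)  # finds the file and reads it
```

The second call never invokes `expensive`, which is the whole point: the presence of the file is the cache key.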

Target locations can be either hard-coded paths or templates filled in with values from Prefect’s context for expressive custom logic. As an example, "{flow_name}/{task_name}/{date:%A}" represents a valid Prefect target that will be formatted at runtime with the appropriate values (using Python’s standard string formatting rules). Users can take advantage of this to cache on a per-task, per-flow, per-day, or even global basis.
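Because templates follow Python's standard string formatting rules, you can preview how one will render with nothing more than `str.format`. The sketch below uses a hand-built dictionary in place of Prefect's runtime context; the flow name, task name, and date are made-up values.

```python
import datetime

target = "{flow_name}/{task_name}/{date:%A}"

# Hypothetical stand-in for the values Prefect's context would supply at runtime.
context = {
    "flow_name": "etl",
    "task_name": "extract",
    "date": datetime.date(2020, 5, 14),
}

# %A is the full weekday name, so this target changes once per day of the week.
path = target.format(**context)
print(path)  # etl/extract/Thursday (with the default English locale)
```

A template like this yields one cache location per task per weekday; dropping `{date:%A}` would cache the task's output indefinitely.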

This makes it incredibly easy to specify file targets for entire flows with one line and keep them organized. In particular, we know that Prefect users in the Python scientific community will benefit from this new feature, as it aligns more closely with their expectations. For a simple example, see Prefect’s fast-growing idiom documentation.

🔐 Secrets

There are many ways to handle secrets and authentication for your Prefect tasks. Between the low-level Prefect Secrets API, the task library’s Secret Tasks, and classic environment variables, we honestly found that users were a little overwhelmed by the menu of choices.

In 0.11.0 we sought to make it easier for users to use the Secrets API and Secret Tasks in their flows by introducing some standardization and a better off-the-shelf experience. For a full writeup of the new recommendations and patterns, see the new secrets deployment recipe.

In particular, Prefect Cloud users can now declare the names of Prefect Secrets they want access to within their flow and Prefect will auto-populate context with their values. Many Prefect built-ins are newly designed to first check context for authentication information before falling back on other defaults.

(And, since we just told you all about result handling and caching, we want to clarify that Secret tasks never persist their output!)

For more information on the expected structure of such secrets, see the current list of secrets that provide automatic authentication to various third-party services (this list will continue to grow over the next few releases).

✅ New Conditional API

We know that one of the major reasons people choose Prefect is that writing Prefect code is as simple as writing Python code. We designed our Python API to be as lightweight and flexible as possible, but one area that has been historically challenging is adding conditional logic to flows. Though we boiled it down to an ifelse() function, it still required some knowledge of how Prefect communicates states between tasks, and, to be honest, didn’t meet the same user experience standard as the rest of the API.

In Prefect 0.11.0, we’re bringing conditional logic up to spec. Just as Python uses indented if statements for conditional logic, we’ve found a clean, indented syntax for conditional code by adding a new case context manager.

For example, say our flow takes a username parameter and sends a postcard if it’s "arthur", but sends an email if it’s "marvin". Now you can write Prefect code that looks just like this:

Click here for the full example

As expected, providing a parameter value of “marvin” to this flow runs only the “marvin” branch:

The case API works with both Prefect’s functional and imperative APIs, and will properly gate any tasks that you put inside the with case block. It also fully supports existing control flow idioms like merging conditional branches back together for common downstream tasks. This new API dramatically improves the user experience of working with conditional branches, and we look forward to seeing the use cases it unlocks.

🔘 Trigger Refactor

Prefect’s task triggers determine whether a task is ready to run, based on the state of its immediate upstream dependencies. For example, tasks can run if all upstreams succeeded (the default), or if at least one failed. The 0.11.0 release updates the signature of Prefect’s trigger functions to accept a richer set of upstream information, allowing more complex trigger behavior. This means that users can now write custom triggers for their tasks that react to specific upstream tasks.

As a simple example, the following custom trigger puts the downstream task in a TriggerFailed state, preventing it from running, if an upstream task named “aggregation” was skipped.

Custom trigger that allows the downstream task to proceed only if the upstream task “aggregation” was not skipped
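A minimal sketch of such a trigger in plain Python follows. It assumes the trigger receives a mapping from edges to upstream states; `UpstreamTask`, `Edge`, `State`, and the `TriggerFailed` exception are simplified stand-ins defined here for illustration, not Prefect's own classes.

```python
from dataclasses import dataclass
from typing import Dict


class TriggerFailed(Exception):
    """Stand-in for the signal that stops a task whose trigger failed."""


@dataclass(frozen=True)
class UpstreamTask:
    name: str


@dataclass(frozen=True)
class Edge:
    upstream_task: UpstreamTask


@dataclass(frozen=True)
class State:
    kind: str  # e.g. "Success", "Failed", "Skipped"

    def is_skipped(self) -> bool:
        return self.kind == "Skipped"


def aggregation_not_skipped(upstream_states: Dict[Edge, State]) -> bool:
    """Allow the task to run unless the upstream named "aggregation" was skipped."""
    for edge, state in upstream_states.items():
        if edge.upstream_task.name == "aggregation" and state.is_skipped():
            raise TriggerFailed('Upstream task "aggregation" was skipped.')
    return True
```

Because each state arrives paired with its edge, the trigger can single out one upstream task by name and ignore what happened to the others.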

We expect this new signature to be used in conjunction with the conditional API improvements described above to offer an incredibly expressive conditional vocabulary for succinctly defining when tasks should run.

Our work continues!

Since open-sourcing some of our backend stack with “Project Earth”, there has been an incredible increase in contributions from the community along with high-quality feedback on Prefect Core’s server and UI. We are excited to keep improving the open source platform based on the community’s feedback, and can’t wait to see what you build next!

Please continue reaching out to us with your questions and feedback — we appreciate the opportunity to work with all of you!

Happy Engineering!

— The Prefect Team
