Prefect + Great Expectations

Introducing: The Artifacts API

Create native Prefect UI integrations with one line of code

Christopher White
Published in The Prefect Blog
Dec 8, 2020

We are delighted to announce an exciting and much-requested new feature in Prefect 0.13.19: the Artifacts API! Thanks to this powerful and flexible building block, users can begin publishing data from their task runs that is rendered natively in the Prefect UI. Best of all, the developer experience reflects what users have come to expect of Prefect: by adding just one line of code, any library can have a first-class UI integration.

To demonstrate how effective this API is, we’re also announcing an off-the-shelf integration with Great Expectations that was built entirely in the open, on top of the public API. The same patterns we used to build this integration are immediately available for all users and libraries to take advantage of, and we can’t wait to see what they build!

Task Run Artifacts

One of the original motivations for Prefect’s design was the ability to handle explicit data dependencies in a first-class way. This made an enormous difference for users who want to pass data between tasks, checkpoint or cache it for future runs, or use pluggable storage backends. It has become a basic requirement for any modern workflow tool.

But this only scratches the surface. There are other, sometimes more interesting, modes of data processing that involve the production of structured data without explicit data dependencies. For example, many of our machine learning users want to publish performance graphs for their models during training. Others want status updates from long-running tasks, or to publish links to relevant internal systems. Yet others want to publish documentation, sample data, or data quality checks from in-progress tasks. Ultimately, we want to enable entire “mini-applications” based on runtime-discovered data. This is where the new Artifacts API comes into play.

The Artifacts API allows users to publish structured metadata from their task runs that can be natively rendered in the Prefect UI! To get started, the API currently supports two flexible artifact types: link and markdown.

The Prefect README, seen as a markdown artifact in the Prefect UI

As we collect feedback from the community, we expect to greatly expand the collection of supported types to include progress bars, model outputs, tables, JSON, diagnostics, images, and more! Users can even contribute directly to this feature’s development, because our UI is completely open-source.

To get started, all you have to do is add one of the following lines of code to any task orchestrated through a Prefect API. After the task runs, navigate to the UI’s Artifacts tab to see the output:

from prefect.artifacts import create_link, create_markdown

# publish a markdown artifact
create_markdown("# Hello!\n Place markdown text here.")

# publish a link artifact
create_link("http://prefect.io/")

Yes, it’s really that easy!
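
As a minimal sketch of where that one line might live, the snippet below keeps the artifact call behind a small helper so the same function can be exercised without a Prefect backend. The names `render_status` and `publish_status` are hypothetical and not part of Prefect; inside a task, you would pass `create_markdown` from `prefect.artifacts` as the `publish` argument.

```python
from typing import Callable, List


def render_status(step: str, done: int, total: int) -> str:
    """Build a small markdown status report (hypothetical helper, not a Prefect API)."""
    percent = 100 * done // total if total else 0
    return f"# {step}\n\nProcessed {done} of {total} items ({percent}%)."


def publish_status(publish: Callable[[str], object], step: str, done: int, total: int) -> str:
    """Render the report and hand it to `publish`.

    In a Prefect task, `publish` would be `prefect.artifacts.create_markdown`;
    any callable that accepts a markdown string works for a local dry run.
    """
    markdown = render_status(step, done, total)
    publish(markdown)
    return markdown


# Local dry run: collect the markdown in a list instead of calling Prefect.
captured: List[str] = []
publish_status(captured.append, "Training", 25, 100)
print(captured[0])
```

Keeping the rendering separate from the publishing call means the report text can be checked locally, while the task itself still only needs the one-line artifact call.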

Note about the “beta” flag: You might have noticed that the UI Artifacts Tab currently has a “beta” flag. This flag is aimed at our users who love to scale up and out with thousands upon thousands of mapped tasks. There are still some performance optimizations required to support such flows, and we will be tackling them in upcoming releases. If you’re not one of those users, consider this feature ready to go!

Great Expectations

If you haven’t already heard, Great Expectations is a popular Python framework for automating data testing and profiling. Many Prefect users depend on Great Expectations to ensure the quality of their data. When the Great Expectations team reached out to collaborate on a Prefect / Great Expectations integration, the timing couldn’t have been better: we were already in the design phase of our new Artifacts API, and Great Expectations’ reports were a perfect fit! With the latest release, users can think of the new RunGreatExpectationsValidation task as an off-the-shelf way to automatically render their Great Expectations validation reports in the Prefect UI, blessed by the Great Expectations team. (The Great Expectations team has also published a companion blog post.)

Example of a Great Expectations validation report rendered in the Prefect UI

At Prefect, one of our design guidelines is to meet our users where they are: to allow them to continue using the same tools, in the same way, without having to modify their code to work with Prefect. Therefore, it is imperative that when we add a new feature to our toolbox, it doesn’t require any insider knowledge or “cheat codes” to apply it to its maximum potential. For this reason, we built the Prefect / Great Expectations collaboration entirely in the open and relied only on user-facing APIs to develop it. The result is a robust and maintainable integration that delivers important information to users exactly when they need it.

Thanks to this pioneering work, anyone can follow the Great Expectations blueprint to add a first-class, native UI integration for their favorite libraries. Publish completely custom reports, or a link to an external resource, or a status update from a long-running task: the possibilities are endless.
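
One way to picture such a custom report, sketched under assumptions: the `checks` dictionary and the `build_report` helper below are hypothetical and do not reflect how the Great Expectations integration is implemented. The result is a plain markdown string, which is the kind of input `create_markdown` takes.

```python
from typing import Dict


def build_report(title: str, checks: Dict[str, bool]) -> str:
    """Summarize named pass/fail checks as a markdown document (hypothetical helper)."""
    passed = sum(checks.values())
    lines = [f"# {title}", "", f"{passed} of {len(checks)} checks passed.", ""]
    for name, ok in checks.items():
        lines.append(f"- {name}: {'passed' if ok else 'FAILED'}")
    return "\n".join(lines)


# Illustrative data only; real results would come from your own library.
report = build_report(
    "Nightly data quality",
    {"row count > 0": True, "no null ids": False},
)
print(report)
# Inside a Prefect task, the string could then be published with
# prefect.artifacts.create_markdown(report).
```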

Please continue reaching out to us with your questions and feedback — we appreciate the opportunity to work with all of you!

Happy Engineering!

— The Prefect Team
