Orchestrate Your Data Science Project with Prefect 2.0

Make Your Data Science Pipeline Resilient Against Failures

Published in The Prefect Blog

9 min read · Jun 29, 2022

Motivation

A typical data science pipeline has many components, such as loading data, processing data, training a model, and making predictions. As a project grows, the number of components, as well as the number of dependencies between them, increases.

If each component has an independent chance of failing, every component you add raises the likelihood that at least one of them fails, and with it the whole run. In a large enough pipeline, failures are therefore inevitable.
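To make that concrete, here is a minimal sketch of the arithmetic. It assumes every component fails independently with the same probability, which is a simplification, and that a single failed component fails the whole run. The function name and the 1% failure rate are made up for illustration.

```python
def pipeline_failure_probability(p_fail: float, n_components: int) -> float:
    """Chance that at least one of n independent components fails.

    Assumes each component fails with the same probability p_fail and
    that one failed component is enough to fail the entire run.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if n_components < 0:
        raise ValueError("n_components must be non-negative")
    # The run succeeds only if every component succeeds.
    p_all_succeed = (1.0 - p_fail) ** n_components
    return 1.0 - p_all_succeed


# Hypothetical 1% failure rate per component.
for n in (1, 5, 10, 20):
    p = pipeline_failure_probability(0.01, n)
    print(f"{n:>2} components: {p:.1%} chance that a run fails")
```

Under these assumptions, a pipeline of 20 components that each fail only 1% of the time fails in roughly 18% of runs.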

Instead of trying to prevent every failure, we should write code so that when a failure occurs, our pipeline will:

fail gracefully
recover quickly

How can we do that? That is where negative engineering comes in.

What is Negative Engineering?

Before talking about negative engineering, let’s talk about positive engineering. Positive engineering is writing code to achieve a certain objective. That objective could be:

training a good ML model
gaining insights from your data

Written by Khuyen Tran