Schedule & orchestrate dbt Cloud jobs with Prefect
Modular Data Stack with dbt Cloud Prefect block
⚠️ Note: I no longer work at Prefect. This post may be completely outdated. Refer to Prefect docs and website to stay up-to-date.
This short post will walk you through how to set up dbt Cloud jobs and orchestrate them with Prefect. It assumes that you have already signed up for dbt Cloud and know how to use dbt.
dbt Cloud setup
First, you need to retrieve the dbt Cloud account ID and create an API key. To do that, go to Account Settings:
In the URL, you should now see the account ID. Copy it and paste it into DBT_CLOUD_ACCOUNT_ID in your .env file. Now, from the same Account Settings page, go to the API Access section:
Create and copy the API key:
Programmatic block creation
Now, paste this key into DBT_CLOUD_API_KEY in your .env file. Your .env file should now contain those two environment variables:
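For illustration, a minimal .env file could look like this (both values are placeholders, not real credentials):

```shell
# .env (placeholder values; substitute your own account ID and API key)
DBT_CLOUD_ACCOUNT_ID=12345
DBT_CLOUD_API_KEY=abc123def456
```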
The .env file is only needed if you want to create a DbtCloudCredentials block programmatically. The easiest way is to follow the “How to Build a Modular Data Stack — Data Platform with Prefect, dbt and Snowflake” blog post series and the prefect-dataplatform GitHub repository. You can directly leverage one of the automated deploy scripts, e.g., the local execution setup shown in the deploy_locally.py script.
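As a rough sketch of what that programmatic setup boils down to, the script below reads the two variables and saves a DbtCloudCredentials block. It assumes prefect-dbt is installed, that the variables from your .env file are exported into the environment, and that "default" is just a placeholder block name:

```python
import os


def load_dbt_cloud_settings() -> tuple:
    """Read the dbt Cloud API key and account ID from environment variables."""
    return os.environ["DBT_CLOUD_API_KEY"], int(os.environ["DBT_CLOUD_ACCOUNT_ID"])


def create_dbt_cloud_credentials_block(block_name: str = "default") -> None:
    """Save a DbtCloudCredentials block under the given (placeholder) name."""
    from prefect_dbt.cloud import DbtCloudCredentials  # pip install prefect-dbt

    api_key, account_id = load_dbt_cloud_settings()
    DbtCloudCredentials(api_key=api_key, account_id=account_id).save(
        block_name, overwrite=True
    )
```

Calling create_dbt_cloud_credentials_block() once is enough; afterwards the block can be loaded by name from any flow.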
Block creation from the UI
Alternatively, you can use the dbt Cloud API key and account ID to configure a block from the Prefect UI:
After configuring the credentials, we need to set up our dbt Cloud project. If you already have one, you can skip the section below.
Set up your dbt Cloud project
First, go to your dbt Cloud account and create a new project:
Follow the guided onboarding and fill in your data warehouse credentials:
At the very end, you’ll need to select your Git repository:
Then, you can start developing in the IDE:
Running a simple dbt compile can be helpful to validate that everything is working as expected:
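Assuming you run it from the IDE command bar (or locally with dbt Core), the command is simply:

```shell
# Compiles all models to SQL without executing them against the warehouse
dbt compile
```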
Create a dbt Environment
Before we can create jobs, we need to configure an Environment.
You can name it corresponding to your dev/prod Snowflake data warehouse or schema. You can also point it to a given Git branch corresponding to your dev/prod environment:
Once the environment is created, dbt Cloud will encourage you to create a new job:
Create a dbt job
Follow the “Create New Job” wizard and make sure to disable the schedule so that you can orchestrate that job with Prefect:
Copy the Job ID from the URL
The URL contains the job ID that we will need to trigger that run:
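For example, with hypothetical account, project, and job IDs, the URL has roughly this shape; the final numeric segment is the job ID:

```
https://cloud.getdbt.com/#/accounts/12345/projects/67890/jobs/111213/
```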
Run a dbt Cloud job with Prefect
We are now ready to trigger a dbt Cloud job from Prefect. Here is a simple flow that loads the credentials block (which securely stores the dbt Cloud account ID and API key) and triggers a run for that specific job ID (adjust the job ID to match yours):
Once you run this Python script, you’ll be able to follow the logs either from the terminal:
Or from dbt Cloud:
Or from the Prefect Cloud UI:
The logs in the Prefect Cloud UI make it easy to navigate from Prefect to the logs in the dbt Cloud dashboard:
This was a short demo showing how to set up a dbt Cloud job and orchestrate it with Prefect blocks and the prefect-dbt collection. To learn more about building a data platform with Prefect, dbt, and Snowflake, including how to schedule this dbt Cloud flow, check out our full tutorial series:
How to Build a Modular Data Stack — Data Platform with Prefect, dbt and Snowflake
Build a data platform with Snowflake & dbt, and use Prefect to observe and coordinate your data stack
If anything discussed in this post is unclear, feel free to tag me when asking a question in the Prefect Community Slack or Prefect Discourse.
Thanks for reading, and happy engineering!