Schedule DBT models with Apache Airflow using a Docker container

Kalayarasi Rajendran · BI3 Technologies · Nov 29, 2022

[Figure: Workflow for DBT scheduling]

This blog offers guidelines on utilizing a Docker container to trigger DBT models with Apache Airflow.

Below are the steps to be followed to trigger DBT models,

Step 1: Install DBT on the local system.

The CLI command to install DBT:

pip install dbt-snowflake

Check the DBT version:

dbt --version

Link to install DBT: Install with pip | dbt Developer Hub (getdbt.com)
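A minimal sketch of the whole step, assuming Python 3 is already installed (the virtual environment name dbt-env is illustrative):

# optional: isolate DBT in its own virtual environment
python -m venv dbt-env
source dbt-env/bin/activate      # on Windows: dbt-env\Scripts\activate

# dbt-snowflake pulls in dbt-core as a dependency
pip install dbt-snowflake

# confirm the installation
dbt --version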

Step 2: After that, install Docker on the local system.

Link to install Docker: Get Docker | Docker Documentation

Step 3: Navigate to the folder where the DBT project should be created, and use the command below,

dbt init DBT_sample

While creating the DBT project, you have to supply your Snowflake credentials: account URL, database, schema, role, username, and password; all of them are mandatory.
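For reference, those answers end up in a profiles.yml file (by default under ~/.dbt/) shaped roughly like this; every value below is a placeholder, not a real credential:

# ~/.dbt/profiles.yml (written by dbt init)
DBT_sample:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: <account_identifier>
      user: <username>
      password: <password>
      role: <role>
      database: <database>
      warehouse: <warehouse>
      schema: <schema>
      threads: 1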

[Figure: DBT project creation]

Step 4: Open the project you just created in Visual Studio Code.

[Figure: Project structure]

Use the command below to test the connection,

dbt debug

[Figure: Account connection test]

Step 5: Then create a single model inside the models folder and run it from the VS Code terminal; a minimal example follows the figure below.

[Figure: Model creation]
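A DBT model is just a SQL file in the models folder. A minimal sketch (the file name my_first_model.sql is illustrative, not from the original project):

-- models/my_first_model.sql
-- DBT wraps this SELECT in a CREATE VIEW (or TABLE) statement when the model runs
select 1 as id, 'sample' as label

Run it with dbt run, or dbt run --select my_first_model to build only this model.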

Create a root folder inside the project directory.

[Figure: Root folder setup]
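For orientation, here is one possible layout of the project root by the end of the walkthrough (a sketch; the file names are taken from the later steps):

DBT_sample/
├── dbt_project.yml
├── models/
│   └── my_first_model.sql
├── Dockerfile            (Step 7)
├── docker-compose.yml    (Step 7)
├── requirements.txt      (Step 7)
└── dags/
    ├── DBT_airflow.py    (Step 9)
    └── commands.sh       (Step 9)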

Step 6: Open Docker and create an account.

[Figure: Docker home page]

Step 7: Create a Dockerfile along with docker-compose.yml and requirements.txt, then add the configuration below to each file.

To create each file in PowerShell, run New-Item <filename>.

docker-compose.yml:

version: "3.9"
services:
  web:
    build: ./

(The steps below use docker build and docker run directly, so the compose file is optional here.)

requirements.txt:

dbt-core
dbt-snowflake

Dockerfile:

FROM python:3.10.5
WORKDIR /
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
# copy the whole DBT project into the image root
COPY . .
RUN dbt clean --project-dir /
RUN dbt deps --project-dir /
# the actual command is injected at run time through the cmd environment variable
CMD ["/bin/bash", "-c", "${cmd}"]
EXPOSE 8080

Step 8: After that, run the model inside Docker.

To build the Docker image, use the command below,

docker build -t <image name> -f <Dockerfile name> .

Example: here the image is tagged dbt and the Dockerfile is named Dockerfile.

docker build -t dbt -f Dockerfile ./
[Figure: Docker image creation]

To run the container, use the command below; here dbt is the image name, and the cmd variable carries the command that the Dockerfile's CMD executes,

docker run -e cmd='dbt run --project-dir /' dbt
[Figure: Docker container creation]
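The image and the finished container can be checked with standard Docker commands; <container id> is whatever docker ps -a prints for the run above:

docker images                 # the dbt image should be listed
docker ps -a                  # the container from the run above
docker logs <container id>    # shows the dbt run output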

Step 9: Now start the Airflow implementation by creating a dags folder inside the project.

Create the commands.sh and DBT_airflow.py files inside the dags folder.

DBT_airflow.py:

Tasks can be scheduled in this file. Hourly scheduling is used here.

from airflow import DAG
import pendulum
from airflow.operators.bash import BashOperator

# run_DBT is the DAG name; its single task runs commands.sh every hour
with DAG(
        dag_id='run_DBT',
        description='First DAG',
        schedule_interval='@hourly',
        start_date=pendulum.datetime(2022, 10, 18, tz="UTC")) as dag:
    # the trailing space keeps Airflow from treating the .sh path as a Jinja template file
    task = BashOperator(task_id='task_run', bash_command='/commands.sh ', dag=dag)

commands.sh:

cd /
dbt run --project-dir /

Next, modify the Dockerfile and the requirements.txt file as follows,

Dockerfile:

FROM python:3.10.5
WORKDIR /
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY . .
# make the DAG files visible to Airflow
COPY /dags /root/airflow/dags
RUN dbt clean --project-dir /
RUN dbt deps --project-dir /
# initialize the Airflow metadata database at build time
RUN airflow db init
# default command: start the Airflow webserver
ENV cmd="airflow webserver"
CMD ["/bin/bash", "-c", "${cmd}"]
EXPOSE 8080

requirements.txt:

dbt-core
dbt-snowflake
apache-airflow
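Note that EXPOSE by itself does not publish the port: when rebuilding and running the Airflow image, map port 8080 to the host (the image tag dbt follows the earlier build step),

docker build -t dbt -f Dockerfile ./
docker run -p 8080:8080 dbt      # cmd defaults to "airflow webserver"

Without the -p 8080:8080 mapping, localhost:8080 in Step 11 will not reach the webserver.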

Step 10: Set up a user account for Airflow.

Link to Airflow account creation: Webserver — Airflow Documentation (apache.org)

Go to the Docker container and open a terminal.

[Figure: Docker container details]
# create an admin user
airflow users create \
    --username admin \
    --firstname Peter \
    --lastname Parker \
    --role Admin \
    --email spiderman@superhero.org
# the command prompts for a password; then start the scheduler
airflow scheduler
[Figure: Airflow account creation]
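With the webserver and scheduler running, the DAG can also be exercised from the same terminal using the standard Airflow CLI,

airflow dags list                                 # confirm run_DBT was picked up
airflow dags unpause run_DBT                      # new DAGs start out paused
airflow dags trigger run_DBT                      # manual run
airflow tasks test run_DBT task_run 2022-10-18    # run the task once, outside the scheduler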

Step 11: Open the browser through Docker.

[Figure: Opening the browser from the container]

Go to localhost:8080 to reach the Airflow login page.

[Figure: Airflow login page]

Then move to the Airflow DAGs page,

[Figure: Airflow DAGs page]

Look for your DAG (run_DBT) in the list of DAGs.

[Figure: Finding the DAG on the DAGs page]

Now trigger the DAG,

[Figure: DAG trigger]

Once the DAG has been triggered, it executes as depicted below,

[Figure: DBT model scheduled]

Finally, the DBT models are triggered through Apache Airflow running in a Docker container. The DAG can either be run manually or left to the schedule for automatic runs.

About Us:

Bi3 has been recognized as one of the fastest-growing companies in Australia. Our team has delivered substantial and complex projects for some of the largest organizations around the globe, and we're quickly building a brand that is well known for superior delivery.

Website: https://bi3technologies.com/

Follow us on,
LinkedIn: https://www.linkedin.com/company/bi3technologies
Instagram: https://www.instagram.com/bi3technologies/
Twitter: https://twitter.com/Bi3Technologies
