How I passed the dbt Fundamentals certification with Databricks

tl;dr dbt is an open source project for ELT. It enables analytics engineers to transform data by writing SQL in a re-usable way. This article describes how I passed the dbt Fundamentals certification and what you need to know to run the dbt training course with Databricks.

Frank Munz
Geek Culture
3 min read · Mar 21, 2022

dbt is the “T” in ELT. It doesn’t extract or load data, but it’s extremely good at transforming data that’s already loaded into your warehouse. This “transform after load” architecture is becoming known as ELT (extract, load, transform).

how I passed the dbt certification, @frankmunz
Image credit: dbt https://www.getdbt.com/

I got certified in dbt. To keep this brief, here is what matters if you want to do the same:

dbt Fundamental Training

  • dbt Labs offers the online dbt Fundamentals training course. It’s a good course with excellent presenters, and I enjoyed taking it. Did I mention it is free?
  • The estimated time to work through the course, including all the labs, is 5 hours. I managed to do it in half a rainy Saturday without much prior dbt knowledge. This included the time to figure out the data loading (see below). After reading this article, you might be even quicker!
  • To run the course you need a dbt cloud account (the setup is described in the course) and a Databricks trial account. Both accounts are free.
  • The course is designed for Databricks, Redshift, Snowflake, etc., but loading the sample data isn’t explained for Databricks at the time of this writing. Since I spend most of my time with Databricks (I am biased, see my bio), I wasn’t interested in using Redshift or Snowflake for the tutorial.
  • Here is the quick-hack notebook that I used to load the sample data into Databricks. The code is SQL only and makes use of Databricks Auto Loader. I ingested all sample data into a single database; you might enjoy a fancier solution.
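
If you don't have the notebook at hand, the loading step can be sketched roughly as follows. Note that this is not the notebook's code: instead of Auto Loader it uses the simpler Databricks `COPY INTO` command, and it only builds the SQL statements as strings. The database name, table names and file paths are hypothetical placeholders; take the real locations of the sample data from the course material.

```python
# Hypothetical source files: replace with the locations given in the course.
SOURCES = {
    "customers": "s3://example-bucket/customers.csv",
    "orders": "s3://example-bucket/orders.csv",
    "payments": "s3://example-bucket/payments.csv",
}


def build_load_statements(database, sources):
    """Return SQL statements that load each CSV file into its own table
    inside a single database."""
    statements = [f"CREATE DATABASE IF NOT EXISTS {database}"]
    for table, path in sources.items():
        target = f"{database}.{table}"
        # COPY INTO needs an existing target table; an empty one is enough
        # because the schema is merged in from the file during the load.
        statements.append(f"CREATE TABLE IF NOT EXISTS {target}")
        statements.append(
            f"COPY INTO {target} FROM '{path}' FILEFORMAT = CSV "
            "FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true') "
            "COPY_OPTIONS ('mergeSchema' = 'true')"
        )
    return statements


if __name__ == "__main__":
    for stmt in build_load_statements("dbt_training", SOURCES):
        print(stmt + ";")
```

In a Databricks notebook you could paste the printed statements into SQL cells, or run each one with `spark.sql(stmt)`.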

dbt for Databricks users

  • The purpose of this write-up is to save you some time if you want to follow along with the dbt training as it is outlined. Note that for using dbt with Databricks in production, there are much better ways. You might want to start with Databricks Partner Connect and also check out the dbt blog.
  • Update: check out the Databricks documentation for dbt.
  • For sure, dbt is an interesting skill for any data engineer (pardon, any analytics engineer :-) ). If you are working with Databricks, also check out Delta Live Tables, which supports SQL and Python, data lineage, and runtime optimization, and brings support for streaming, but eliminates the Jinja template language and the YAML files required for dbt. Stay tuned for more, since Databricks Ventures partnered with dbt Labs.
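
In case the Jinja part sounds abstract: a dbt model is a SQL file in which other models are referenced through `{{ ref('...') }}`, and dbt swaps in the real table name when it compiles the model. The toy sketch below imitates only that substitution with a regular expression. It is not how dbt is implemented (dbt uses the actual Jinja engine and also derives the model dependency graph from these references), and the model and schema names are made up for illustration.

```python
import re

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")


def resolve_refs(model_sql, schema):
    """Replace every {{ ref('name') }} with 'schema.name'."""
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", model_sql)


# Hypothetical model: counts orders per customer from a staging model.
MODEL = """
select customer_id, count(*) as order_count
from {{ ref('stg_orders') }}
group by customer_id
"""

if __name__ == "__main__":
    print(resolve_refs(MODEL, "analytics"))
```

With Delta Live Tables there is no such templating layer; you reference upstream tables directly in SQL or Python.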

The dbt Fundamental Certification

Difficulty

  • Passing the cert after going through most of the dbt labs is moderately difficult. The exam forces you to go through the study material again and also look up some stuff in the official documentation. This is good.

Allowed material for the cert

  • There are some tough questions, but you are allowed to use any material. So take the time to do some research. Obviously, going through the course material again gives you an advantage.

Time limit / re-takes

  • There is no time limit that I remember. You can re-take the exam if you like. Since I am a lazy person and there are a lot of questions, I’d recommend getting it right the first time (I had other plans on that Saturday too).

Good luck!!!

I managed to pass the certification on the first attempt. Good luck to you!!!

dbt Fundamental certification, Dr. Frank Munz

Please follow me on Medium and clap for this article if you enjoyed reading it. For more cloud-based data science, data engineering, and AI/ML follow me on Twitter (or LinkedIn).

Frank Munz

Cloudy things, large-scale data & compute. Twitter @frankmunz. Former Tech Evangelist @awscloud, Principal @Databricks now. Personal opinions here. #devrel ❤️.