
Preparing for the dbt 'Analytics Engineering' Certification

Hints, tips and advice in preparing for the exam

Paul Fry
Geek Culture


This post aims to provide advice and insights for those working towards the dbt Analytics Engineering certification. It's the advice I wish I had been given!

Photo by Thought Catalog on Unsplash

Agenda

  1. Exam Overview
  2. Exam Candidate
  3. Study Approach and Resources Used
  4. Certification Review

Exam Overview

The exam was first introduced in June 2022, and the high-level exam details are as follows:

  • Duration: 2 hours
  • Number of questions: 65
  • Passing score: 65%
  • Price: $200 USD
dbt Analytics Engineering Certification Exam Overview | getdbt.com

The exam consists of questions on the following topics:

  • Developing dbt models
  • Debugging data modelling errors
  • Monitoring data pipelines
  • Implementing dbt tests
  • Deploying dbt jobs
  • Creating and maintaining dbt documentation
  • Promoting code through version control
  • Establishing environments in a data warehouse for dbt

dbt Labs has also published a certification study guide.

Note: The Exam Doesn't Exclusively Cover dbt Content

When preparing for the exam, remember that the certification is for 'Analytics Engineering' with dbt rather than a certification exclusively for dbt. As a result (and as indicated by the last two bullet points above), the exam doesn't solely cover dbt. Key examples of non-dbt topics include:

  • Jinja templating
  • Git workflows
  • SQL, e.g., recommended use of CTEs

Exam Candidate

dbt's certification webpage recommends that exam candidates have:

  • 6+ months of building experience on dbt Core or Cloud
  • and SQL proficiency
dbt Analytics Engineering Certification Exam Overview | getdbt.com

However, I recommend candidates have at least one year of hands-on experience. I've been using dbt for two years and, importantly, have over a year's experience designing and deploying dbt in production. This is key to understanding implementation patterns and everyday use-case considerations, e.g.:

  • The options available for orchestrating and running dbt jobs in production
  • Which dbt-Git workflow to agree on as a team
  • Which dbt snapshot strategy (timestamp vs check) to use when implementing CDC
  • How to manage 'hard deletes' with dbt, e.g., being aware of the invalidate_hard_deletes option
  • The range of dbt tests (and dbt test packages) available, and how to store the results using the store_failures option
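
To make the snapshot and test options in this list more concrete, here is a minimal sketch of how they might be set in dbt_project.yml. The project name my_project, the snapshots target schema and the id, updated_at and status columns are placeholders of mine, not anything taken from the exam. Key names can also differ between dbt versions (newer releases use data_tests in place of tests, for example), so treat this as an illustration and check the reference docs for the version you run.

```yaml
# dbt_project.yml (fragment) - illustrative values only
snapshots:
  my_project:                        # hypothetical project name
    +target_schema: snapshots        # schema the snapshot tables are built in
    +unique_key: id                  # assumed primary key column
    +strategy: timestamp             # detect changes via a timestamp column
    +updated_at: updated_at          # assumed "last modified" column
    +invalidate_hard_deletes: true   # close out rows deleted from the source

    # The alternative 'check' strategy compares column values instead:
    # +strategy: check
    # +check_cols: [status]          # hypothetical column to compare

tests:
  my_project:
    +store_failures: true            # save failing rows to a table in the warehouse
```

The timestamp strategy relies on a trustworthy "last modified" column, whereas the check strategy is the fallback when no such column exists. Knowing when to reach for each is the kind of judgement that production experience gives you.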

Hands-on Experience Isn't Enough

A point commonly made in the dbt Slack community is that hands-on experience alone is not enough to pass the exam. Candidates should also have a firm understanding of the supporting dbt reference documentation and the other resources described below.

Study Approach and Resources Used

My first port of call was, naturally, dbt's certification study guide.

Free Online dbt Training Courses

The study guide says to go through dbt's (free) online courses. I found all of these courses useful, and I recommend the 'dbt Fundamentals' course in particular for onboarding developers who are new to dbt.

dbt's (free) online training courses | courses.getdbt.com

First Key Recommendation: Study the dbt Docs, Reference and Guides Documentation

I recommend that anyone considering sitting the exam first study the official documentation on the dbt website in detail. A common theme on the dbt Slack community's #dbt-certification channel is that experience isn't enough, because most developers have not yet been exposed to much of dbt's functionality. I'd recommend cloning dbt's jaffle_shop project and working through dbt's documentation to replicate the features described.

The dbt website documentation I found particularly useful was:

dbt debugging approach on dbt Guides | getdbt.com

Second Key Recommendation: Become Familiar with dbt Resource Property Configs & Jinja Functions

I recommend having a detailed understanding of the config options available for the different dbt resource types (documentation link). A good test: are you confident of the various config options available for dbt sources? Would you be able to write them from scratch?

Example of some of the source property config options available
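
As an illustration of the kind of source properties in question, here is a sketch of a sources file that exercises several of them. The source name follows the jaffle_shop project mentioned earlier, but the database, loader, table and column names and the freshness thresholds are assumptions of mine for the sake of the example. Some keys have also moved or been renamed in newer dbt versions (tests vs data_tests, for instance), so verify against the reference docs rather than memorising this version.

```yaml
# models/staging/sources.yml - illustrative values only
version: 2

sources:
  - name: jaffle_shop
    description: Raw data loaded from the application database.
    database: raw                        # assumed database name
    schema: jaffle_shop
    loader: fivetran                     # informational only
    loaded_at_field: _etl_loaded_at      # assumed load-timestamp column
    freshness:
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders
        identifier: raw_orders           # actual table name in the warehouse
        description: One record per order.
        columns:
          - name: id
            description: Primary key of the orders table.
            tests:
              - unique
              - not_null
          - name: status
            tests:
              - accepted_values:
                  values: ['placed', 'shipped', 'completed', 'returned']
      - name: customers
        freshness: null                  # opt this table out of freshness checks
```

Writing a file like this from memory, then running dbt source freshness and dbt test against it, is a quick way to find out which properties you actually know and which you only recognise.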

dbt Jinja Functions (link)

dbt provides an array of Jinja functions to help keep your code DRY. I recommend writing a simple macro of your own to explore which common dbt Jinja functions and variables are available.

Example of some of the dbt Jinja functions available

dbt Blog Posts

Aside from the training courses, the study guide links to a number of dbt blog posts, and the themes of those posts came up in the exam! The blog posts I found particularly useful and pertinent to the exam are as follows:

Posts Relating to dbt Project Structure

Posts Relating to Git Workflows for dbt

dbt Community on Slack

As well as the above, I found the dbt Slack community very handy. The #dbt-certification channel is dedicated to certification chat and is very useful for understanding the common themes and questions raised by others.

Note: Knowledge of dbt Cloud isn't Required

One of the common questions in the dbt Slack community is whether knowledge of dbt Cloud is required for the exam, i.e., will any questions come up relating to dbt Cloud? Looking at the exam curriculum, it's easy to see why people ask: the fifth topic is 'Deploying dbt jobs', which sounds like jobs in dbt Cloud, right?

Well, the answer is no! Knowledge of dbt Cloud isn't required. According to dbt Labs staff in the dbt Slack community, there are no dbt Cloud-specific questions on the exam, and "any questions related to jobs should be accessible to anyone who has defined a job, regardless of if it's in dbt Cloud or a third party orchestration tool."

Summary of Recommendations

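In summary, I recommend that exam candidates:

  • Work through dbt's certification study guide and the free online training courses
  • Study the dbt Docs, Reference and Guides documentation in detail
  • Become familiar with dbt resource property configs and Jinja functions
  • Read the dbt blog posts linked from the study guide
  • Follow the #dbt-certification channel in the dbt Slack community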

Certification Review

The objective of dbt Labs in producing the certification is to educate and to establish specific standards and patterns for how they would like people to use dbt, and I think it has definitely achieved that. Having gone through the above study materials, I've regularly revised the template scripts I use to follow some of the best practices and naming conventions described. In addition, I found that going through the breadth of the documentation highlights lots of really beneficial but not necessarily obvious functionality, e.g., the dbt test store_failures option.

Anyways, I hope some of this information is of use to others. Reach out to me if you have any questions!

Welsh data architect, based in Dublin. Certified in dbt, Airflow, Snowflake & AWS