dbt unit-test framework

Matthieu Bonneviot
Published in Teads Engineering
Apr 23, 2024

This is a follow-up to my previous article on Unit testing with dbt.

dbt v1.8.0-b1 introduced a dedicated unit-testing framework, and at Teads we decided to try it.

All in one place

There is no longer any need to define fixture and expected models, or to declare a dbt_utils.equality test in a YML file.

All it requires is a single YML file placed in your model folder. I personally store them in a ci folder. The file format is quite straightforward:

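unit_tests:
  - name: my_test
    description: "some meaningful description"
    model: the_model_I_want_to_test
    given:
      # each input is a ref(...) or source(...); the names below are placeholders
      - input: ref('first_upstream_model')
        rows:
          - {col1: val1, col2: val2}
      - input: ref('second_upstream_model')
        rows:
          - {col1: val1, col2: val2}
    expect:
      rows:
        - {col1: val1, col2: val2}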

To run your test: dbt test --select my_test

Several tests in one file

Unlike the previous way of conducting unit tests in dbt, you can now define multiple tests on the same model:

unit_tests:
  - name: my_test_1
    ...
  - name: my_test_2
    ...

Each feature can now be unit-tested independently, with only the inputs and expected rows it needs. This makes the unit tests much easier to maintain.
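
As an illustration, here is a minimal sketch of two atomic tests on the same model. The model hourly_revenue, its input raw_events, and all column names are hypothetical, and the expected rows assume the model sums price per campaign_id and drops rows where is_test is true:

```yaml
unit_tests:
  # Hypothetical model and columns, for illustration only.
  - name: sums_price_per_campaign
    description: "prices of the same campaign are summed"
    model: hourly_revenue
    given:
      - input: ref('raw_events')
        rows:
          - {campaign_id: 1, price: 2.0, is_test: false}
          - {campaign_id: 1, price: 3.0, is_test: false}
    expect:
      rows:
        - {campaign_id: 1, revenue: 5.0}

  - name: ignores_test_traffic
    description: "rows flagged as test traffic are excluded"
    model: hourly_revenue
    given:
      - input: ref('raw_events')
        rows:
          - {campaign_id: 1, price: 2.0, is_test: false}
          - {campaign_id: 1, price: 9.0, is_test: true}
    expect:
      rows:
        - {campaign_id: 1, revenue: 2.0}
```

Each test only lists the columns its feature depends on. To run every unit test of the project at once, dbt test --select "test_type:unit" should work.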

Upstream dependencies materialization

As most dbt models refer to upstream models or sources, these dependencies must be fixtured as part of the unit tests. The fixtures are not physically materialized in your engine: they are inlined into the tested model, as ephemeral models are.

This makes debugging much easier, since everything is inlined, and it removes the need for a separate test project with a cleanup strategy, which also avoids collisions between test model names.
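
A sketch of what fixturing both kinds of dependency can look like, assuming a hypothetical model that joins an upstream model raw_events with a source crm.campaigns (all names, columns, and the expected row are illustrative):

```yaml
unit_tests:
  # Hypothetical model, inputs and columns, for illustration only.
  - name: joins_events_with_campaigns
    model: revenue_by_campaign_name
    given:
      # upstream dbt model
      - input: ref('raw_events')
        rows:
          - {campaign_id: 1, price: 2.0}
      # upstream source, declared elsewhere in a sources: block
      - input: source('crm', 'campaigns')
        rows:
          - {campaign_id: 1, campaign_name: "spring_sale"}
    expect:
      rows:
        - {campaign_name: "spring_sale", revenue: 2.0}
```

When a dependency is irrelevant to the tested feature, the dbt documentation describes passing an empty fixture with rows: [] rather than inventing data for it.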

Variable override

In our use case, we are using incremental table materialization partitioned by hour. Consequently, in production, our models are executed with an input variable supplied via the dbt command line:

dbt run --select my_model --vars '{date: "2024:04:16 17:00:00"}'

This variable can be specified in the unit tests, enabling us to validate the logic associated with it:

unit_tests:
  - name: my_test
    description: "some meaningful description"
    model: the_model_I_want_to_test
    overrides:
      vars:
        date: "2024:04:16 17:00:00"
    given:
      ...
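
To make this concrete, here is a sketch of a complete test built around the override. It assumes a hypothetical model hourly_revenue that keeps only the rows of raw_events whose string column event_hour equals var('date'); the model, input, and column names are not from our code base:

```yaml
unit_tests:
  # Hypothetical model, input and columns, for illustration only.
  - name: keeps_only_requested_hour
    description: "rows outside the hour given by the date variable are dropped"
    model: hourly_revenue
    overrides:
      vars:
        date: "2024:04:16 17:00:00"
    given:
      - input: ref('raw_events')
        rows:
          # event_hour is assumed to be a STRING column
          - {event_hour: "2024:04:16 17:00:00", price: 2.0}
          - {event_hour: "2024:04:16 18:00:00", price: 3.0}
    expect:
      rows:
        - {event_hour: "2024:04:16 17:00:00", revenue: 2.0}
```

Because the variable is set in the test itself, no --vars argument is needed on the command line when running it.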

Nested structure integration

It’s feasible to define arrays and complex structures:

bidders: 'struct([struct(struct(895528 AS gid, ...) AS element)] AS list)'

There are some limitations to be aware of:

  • Every field of the structure needs to be defined, even if it is NULL. Specifying an unused field as an untyped null, such as NULL AS cid, often works, but not always. For floats, for example, you may need to cast the null explicitly, like CAST(NULL AS FLOAT64) AS price. Consequently, in our case, we ended up with several lines of NULL fields just to set a single field in a structure.
  • The error message provided by the database when the structure is not properly defined is not always clear and doesn’t offer much help in fixing it:
    Database Error
    Invalid cast from STRUCT<list ARRAY<STRUCT<element STRUCT<gid INT64, cid INT64, score INT64, ...>>>> to STRUCT<list ARRAY<STRUCT<element STRUCT<gid INT64, cid INT64, score FLOAT64, ...>>>> at [52:14136]
  • Finally, if a developer adds a new field to the structure, it can break our tests on the master branch, which isn’t ideal.
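
A sketch of a fixture that follows these rules, under the simplifying assumption that the element structure has exactly three fields (gid INT64, cid INT64, score FLOAT64). The real structure has more fields, and the model, input, and the auction_id and first_gid columns are hypothetical:

```yaml
unit_tests:
  # Hypothetical model, input and output columns, for illustration only.
  - name: reads_first_bidder_gid
    model: the_model_I_want_to_test
    given:
      - input: ref('auctions')
        rows:
          - auction_id: 1
            # Every field is listed. cid can stay an untyped NULL,
            # while score is cast so that it is typed as FLOAT64.
            bidders: 'struct([struct(struct(895528 AS gid, NULL AS cid, CAST(NULL AS FLOAT64) AS score) AS element)] AS list)'
    expect:
      rows:
        - {auction_id: 1, first_gid: 895528}
```

The error message shown above is consistent with writing NULL AS score without the cast: the field then appears to be typed as INT64, and the cast to the FLOAT64 column fails.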

Clear output for the unit-tests’ failures

When a test fails, the output uses color highlighting to show the differences between actual and expected rows, which helps identify the issue quickly.

No support for UDF

As of now, if your model incorporates a User-Defined Function (UDF), the generated SQL for the unit test is incorrect. Consequently, you won’t be able to successfully test your model using the unit test framework.

My feedback

This unit-test framework is definitely nicer to use than dbt_utils.equality: tests are faster to write and debug, more atomic, and produce clear output. It has become the new standard at Teads.

Matthieu Bonneviot
Teads Engineering

Software engineer at @Teads, in love with craftsmanship and high-volume real-time applications.