ON DATA ENGINEERING

Leveraging DBT as a Data Modeling tool

Reflections on one year of using DBT for modeling a data warehouse

Julien Kervizic
Published in Hacking Analytics
7 min read · Jun 24, 2021

Photo by Artur Shamsutdinov on Unsplash

DBT is a tool that aims to make it easier for analysts and data engineers to transform and model data within a data warehouse. It provides a command-line interface as well as a documentation server and an RPC server.

After more than a year of working with DBT, I thought it would be good to reflect on what it offers, what it currently lacks, and what features would be desirable additions to the tool.

Jinja capabilities

Jinja is a Python templating engine used in data tools such as Airflow and Superset, and in infrastructure-as-code tools such as Ansible.

DBT leverages Jinja in several ways: as a wrapper around its models, to provide configuration objects, and to define macros. Models can then re-use these Jinja artifacts. In addition, DBT uses Jinja templates within its core inner workings; as a result, Jinja ends up being a first-class citizen in the tool.
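To make the macro and re-use idea concrete, here is a minimal sketch that uses the plain `jinja2` Python library rather than DBT itself. The macro `cents_to_dollars`, the table `raw.orders`, and the column names are hypothetical, chosen only for illustration; DBT adds its own context and functions on top of Jinja, which this sketch does not attempt to reproduce.

```python
from jinja2 import Environment

# Hypothetical SQL template: a macro is defined once and re-used in the query.
# The "-" in "{%-" and "-%}" trims surrounding whitespace in the rendered output.
TEMPLATE = """
{%- macro cents_to_dollars(column) -%}
round({{ column }} / 100.0, 2)
{%- endmacro -%}
select
    order_id,
    {{ cents_to_dollars("amount_cents") }} as amount_usd
from {{ source_table }}
where status in ({% for s in statuses %}'{{ s }}'{% if not loop.last %}, {% endif %}{% endfor %})
"""

# Plain Jinja environment (no DBT context); values are passed in at render time.
env = Environment()
rendered_sql = env.from_string(TEMPLATE).render(
    source_table="raw.orders",      # assumed table name
    statuses=["paid", "shipped"],   # assumed filter values
)
print(rendered_sql)
```

The rendered string is ordinary SQL: the macro call expands inline and the loop builds the `in (...)` list, which is the kind of re-use the paragraph above describes.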

I have previously written on the use of templating SQL with Jinja for increased code re-use and legibility. Contrary to a typical implementation of Jinja, DBT’s does not make it easy to enrich the functionalities as…

Julien Kervizic
Hacking Analytics

Living at the interstice of business, data and technology | Head of Data at iptiQ by SwissRe | previously at Facebook, Amazon | julienkervizic@gmail.com