Debug your dbt models like a superhero

Josh Hall
8 min readApr 28, 2023

When it comes to data modeling, dbt has quickly become the de facto standard. As dbt has been rapidly adopted as a staple of the data stack, there have not been complimentary advancements for features allowing data practitioners to work with and understand the core component of dbt: models.

Sure, there are data lineage tools available to help understand how models relate to each other, including dbt’s dag. Dbt’s internal docs can also be helpful to a certain extent, but don’t provide a lot of context into the logic of the actual dbt model. If you know how to traverse your dbt project to find the compiled SQL of your dbt model runs, you may be a step closer in obtaining SQL you can run in an editor, but a SQL editor is not the most conducive for breaking down dbt models, especially more complex models.

Photo by Abraham Barrera on Unsplash

Another real danger to the understanding of models presents itself in the form of data practitioners, who may have introduced and implemented dbt at organizations, moving to different roles at other organizations. These data professionals are likely the ones who built the majority of the dbt models in production. Ensuring that any of the models that were created by these practitioners can continue to run and be maintained is often imperative to the performance and success of the data function at an organization.

Ideally, in all of these cases, the dbt models are running, but what happens when it’s time to update, understand, or more importantly, debug a dbt model? In many cases, this consists of pulling the dbt model into a SQL editor, separating CTEs into their own editor tabs, and attempting to stitch together the model in this manner. A very siloed, manual process.

A better way to debug with dbt models

Count provides a workflow that eliminates these pain points and provides a level of transparency not obtained before when debugging dbt models. By working with dbt models in Count, you enable a collaborative approach to updating and debugging your models, all while gaining features that allow you to easily break down your model, quickly test changes, and isolate logic, without swapping between multiple tabs and pushing multiple changes to your code repository for review.

Exploding CTEs

A powerful feature that marks an ideal starting point for debugging your dbt models in Count is exploding CTEs. This feature allows you to take a model with multiple CTEs and split out the logic of each CTE into its own SQL cell. Additionally, you can work with the output of each individual CTE, as well as the output from the final SELECT from the model. As you can infer, this sets the stage to have a significant amount of control during the debugging process.

Debugging Options

Once you have granular control of the logic in your dbt model through exploding CTEs, there are several ways Count provides value when debugging dbt models.

Real Time Updates

The first and simplest method for debugging dbt models is making the changes directly in the SQL cells that Count has spatially laid out. If you are working with CTEs, you’ll notice the reference lines between SQL cells (each individual CTE), allowing you to see the direction of the logic flowing through the model. Additionally, count provides an output pane for each SQL cell, allowing you to see how the logic is affecting the results of your query in each CTE.

With this in mind, debugging becomes significantly easier and more transparent, as you update the logic in each SQL cell, you can see how it affects the broader model, as the logic continues to flow through the SQL cell references. Although relatively simple, this workflow already provides a significant improvement to the existing solutions available to debugging dbt models. But there are several more powerful features.

Single Cell Changes

Let’s assume that after splitting out your dbt model in Count, you’ve been able to easily identify a specific CTE causing a data quality issue. The logic in the current CTE is relatively complex and you’d like to keep working in Count without having to reference your code repository in case you need to revert changes.

In this case, you can employ a Count feature that allows you to make a copy of the CTE and run it independently of the current logic. By simply running a copy command (command + C / ctrl + C) you can paste the SQL cell anywhere in your canvas. Now you can begin updating the logic to resolve the issue in your copy, while also comparing the changes to the preexisting CTE, all in the same place. Really powerful. Once you’re satisfied with the changes, you can then swap the reference from the logic of the old SQL cell to the updated SQL cell. Fast, efficient, and highly effective.

Entire Model Changes

Now lets assume you’ve identified a dbt model that contains a significant amount of deprecated logic that needs to be resolved. In this case, most, if not all of the CTEs in the model are going to need to have logic changes implemented. At the same time, you want to ensure that the changes you make to the new model are correct, and want to test those changes against the old model.

Count supports this use case particularly well. By utilizing the functionality explained in the previous section, we can copy the entire dbt model (SQL cells) and work with the new SQL cells side by side of the previous logic. This provides a seamless experience allowing practitioners to quickly and easily compare the impact of their changes on the new model. This is incredibly powerful functionality, as this type of work typically involves multiple tables, auditing, and code review. Now it can all be done in one tab, in one canvas, in one tool. Once your changes are complete, you can run tests on the results between the output of the previous model and updated model to ensure your changes are correct.

Export Compiled SQL

After you have made all the necessary changes to your dbt model in Count, you need a way of moving the code back into the SQL file within your dbt project. One way of doing this would be to manually copy and paste the changes you have made within Count. While this works, Count makes this process significantly easier by providing you the ability to export the compiled SQL of your changes.

By exporting compiled SQL from Count SQL cells, you not only copy the SQL contained in the cell, but also all of the cells it depends on as CTEs in the query. By right clicking on any cell, a user will be given this option to copy compiled SQL. This allows you to have a quick, one stop process for easily porting your updates back into your SQL file in your dbt project.

Visualizing Changes

While having the ability to easily split out complex models and quickly write SQL is incredibly powerful, Count provides an additional layer of functionality that enhances the dbt model debugging experience.

Count supports visualizations from any SQL cell. With this functionality, a user can more effectively audit and understand how any changes made to the logic of a model are impacting the results. This eliminates the need to run ad hoc statements to validate results as you work. You have a helper, in the form of a visualization, that clearly communicates any impact of changes made, far more effectively than scrolling through an output pane or running one off queries.

Context in a Collaborative Environment

While all the functionality mentioned above provides a way of debugging dbt models not previously available to data practitioners, Count adds a cherry on top by allowing all of this work to be done in a collaborative, real time environment.

Instead of making changes to a dbt model in a SQL file, pushing the file to your code repository, waiting for a review, addressing any comments, and potentially having to revise work (causing this loop to repeat), all of the work to debug a dbt model can be done beforehand in Count, with more collaboration and transparency than a code repository can give.

Any co-worker on your data team can now see changes you are proposing. Comments and suggestions can be left easily. Co-workers can even provide support by making changes or enhancements alongside you. Additionally, stakeholders can also have access to this process, so they are aware of any changes that may impact decisions or reporting, before the change is ever pushed. These stakeholders may also have deeper insight into how the logic should be working, and by including them in this process, the debugging process has been enhanced beyond a more traditional workflow involving code repositories, meetings, and dozens of DMs.

In Summary

What we have discussed in this article has not traditionally been available to data professionals working with dbt models. The current process for debugging and understanding dbt models is clunky and inefficient. Count provides a workflow that allows for the seamless debugging of dbt models, in an environment that provides additional context and encourages the collaboration of all stakeholders. A win win for the entire organization. If you want to give Count a test drive and see this functionality for yourself, check them out at count.co.

I’m a freelance data content creator who has spent my career working in a wide array of data roles. From large consulting firms to small data start-ups, I’ve worked on challenging problems across the entire data spectrum. I love to discover new tools and learn how they can help organizations better solve problems. Feel free to reach out (Josh Hall) in the Count Slack community if you’re a member (and if you’re not a member, you should join)!

--

--