How to Fix an Error in the DBT Command with the Empty Flag Due to _TABLE_SUFFIX or _PARTITIONTIME

Utilize dbt variables to fix the error

Fumiaki Kobayashi
Art of Data Engineering
3 min readJul 18, 2024

--

Since dbt core 1.8, the empty flag that allows us to create empty resources has been added.

When this flag is specified, a condition limit 0 is added to your query, which means that the output will be empty as well as there will be no cost of the scanning.

This function can be used in a variety of ways, for example,

  • You have written some new queries and want to check that the all of SELECT statements are valid within 30 seconds.
  • You want to write a yaml with dbt-osmosis, so you want to create empty tables.

I often hesitate to repeatedly execute queries that require a lot of analysis cost in DWH for debugging, so I am grateful for such cost-saving features.

Photo by Bobby Donald on Unsplash

However, you need to be careful when using this feature in the case of queries that select pseudo-columns such as _TABLE_SUFFIX and _PARTITIONTIME in BigQuery.

For example, suppose we have the following model.

SELECT
*
FROM
{{ source("ga4", "events_*") }}
WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", CURRENT_DATE("+9") -3 )
AND FORMAT_DATE("%Y%m%d", CURRENT_DATE("+9"))

Running dbt run with the empty flag on this model results in the following error.

Unrecognized name: _TABLE_SUFFIX at [6:3]

To investigate the error, Let’s look at the compiled query.

SELECT
*
FROM
(select * from `bigquery-public-data`.`ga4_obfuscated_sample_ecommerce`.`events_*` where false limit 0)
WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", CURRENT_DATE("+9") -3 )
AND FORMAT_DATE("%Y%m%d", CURRENT_DATE("+9"))

You can see that when adding the empty flag, the data source referenced by source or ref is wrapped as a subquery with limit 0.

Since the pseudo column is no longer retained after the subquery select * is executed, it is no longer possible to select _TABLE_SUFFIX in subsequent queries. That is the cause of the error.

This behavior was reported immediately after the 1.8 release. In fact, this behavior is not a bug, and therefore, a workaround should be considered.

A simple way to handle this is to use flags, a variable that can be used in Jinja to determine whether or not cli options are specified.

We need to look at the dbt-core code to understand the details, but we can see that True/False is stored as the return value of flagsand thus can be used in the same way as is_incremental().

If the scan amount becomes zero by the emptyflag, there is no need to filter by _TABLE_SUFFIX, so all we need to do is to use this variable to not refer to _TABLE_SUFFIX if the emptyflag is specified.

SELECT
*
FROM
{{ source("ga4", "events_*") }}
{% if not flags.EMPTY %}
WHERE
_TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", CURRENT_DATE("+9") -3 )
AND FORMAT_DATE("%Y%m%d", CURRENT_DATE("+9"))
{% endif %}

Nevertheless, there is a slight problem with this method: the query that has require_partition_filter = true and uses _PARTITIONTIME as the partition column still cannot be executed.

In that case, There are still some other solutions you can take:

  • remove require_partition_filter
  • specify a real data column as the partition column instead of using a pseudo-column partition
  • rewrite the behavior of empty flag in combination with other Jinja functions as described in the comments in the issue.

However, a better method may be available by the time dbt core 1.9 is released since dbt-core is updated quickly.

In addition, while these dbt-specific variables and functions are useful and utilizing them makes us feel smarter, you are careful of using them excessively because the learning cost of new members rapidly increases.

--

--