Convert Group By Grouping Sets (Teradata) to Bigquery
Introduction:
Group by Grouping Sets is a teradata extension to the group by clause that allows performing multiple group by clauses in the single statement.
GROUP BY GROUPING SETS is equivalent to the UNION of two or more GROUP BY operations in the same result set.
Empty parentheses () specify a single grand total group that sums all the groups specified in the query.
Prerequisites
- Basic knowledge of the Teradata data warehouse and its SQL syntax
- Good understanding of Bigquery and its SQL syntax
Conversion Logic
Example Scenario:
Lets says, the requirement is to find out the sales_summaries by state within the country, by city and the overall sales_summary.
The sample data looks as below
Teradata SQL:
SELECT country, state, city, SUM(sales_amount) as total_sales
FROM sales_table
GROUP BY GROUPING SETS ((country, state),(city),());
Query Output:
Bigquery SQL:
SELECT
sales_table.country AS country,
sales_table.state AS state,
NULL AS city,
sum(sales_table.sales_amount) AS total_sales
FROM
sales_table
GROUP BY 1, 2UNION ALL
SELECT
NULL AS country,
NULL AS state,
sales_table.city AS city,
sum(sales_table.sales_amount) AS total_sales
FROM
sales_table
GROUP BY 3UNION ALL
SELECT
NULL AS country,
NULL AS state,
NULL AS city,
sum(sales_table.sales_amount) AS total_sales
FROM
sales_table
;
Translation Explanation:
- GROUP BY GROUPING SETS((x),(y),(z)) is equivalent to GROUP BY x UNION ALL GROUP BY y UNION ALL GROUP BY z
- Hence the grouping set ((country, state),(city),()) will have two UNION ALL and groups (country and state) (1st Query) and (city) (2nd Query)
- The Empty parentheses () specify a single grand total group that sums all the groups specified in the query. Hence there is no group by clause in the 3rd Query
After these changes, Bigquery SQL generates the same output as Teradata.