Analytical SQL Tips Series - GROUP BY ALL

Douenergy
In the Pipeline
Published in
2 min readMar 22, 2024

Google BigQuery has recently added support for a new syntax (currently in previewing) called GROUP BY ALL. You may be familiar with GROUP BY, but what does GROUP BY ALL mean ?

When writing a SQL query, especially one that includes aggregation, We often start with the SELECT clause., like

SELECT xxx, yyy, sum(zz), …..

By the time we finish drafting the SELECT clause of a query that requires aggregation, we’ve essentially decided on the grouping keys to use. Typically you put the grouping keys right at the beginning of the SELECT clause. However, having to repeat these columns again in the GROUP BY clause can sometimes feel repetitive and tedious.

A more convenient query

GROUP BY ALL is quite helpful because it removes the necessity to list non-aggregate columns multiple times, which can be quite convenient. Additionally, it offers a simpler way to adjust grouping columns. Instead of having to make changes to both the SELECT and GROUP BY clauses, you can now manage these adjustments more efficiently, enhancing the ease of query modification.

Snowflake, Databricks, BigQuery, and DuckDB all support this syntax natively. Give it a try in your data warehouse, or our live demo below.

Try it out in our demo

Try it on DuckDB WASM . It takes about 10 seconds to initialize DuckDB and create the table

CREATE TABLE Weather AS 
FROM 'https://recce-dbt-tips.s3.amazonaws.com/central-park-weather.csv';
SELECT STATION,
NAME,
AVG(TMAX) AS average_max_temp,
AVG(TMIN) AS average_min_temp
FROM Weather
GROUP BY STATION, NAME;
SELECT STATION,
NAME,
AVG(TMAX) AS average_max_temp,
AVG(TMIN) AS average_min_temp
FROM Weather
GROUP BY ALL;

You can apply these SQL tip in your dbt code review. For more about analytical SQL tips, follow Recce on Linkedin.

More from In The Pipeline

--

--