Difference between GROUP BY and PARTITION BY

Gunjan Sahu
2 min readAug 1, 2023

--

In SQL, GROUP BY and PARTITION BY are used to group data based on some criteria. While both clauses do grouping, there are key differences between them.

The GROUP BY clause is used in SQL queries to define groups based on some given criteria. These criteria are what we usually find as categories in reports. Examples of criteria for grouping are: group all employees by their annual salary level, group all trains by their first station, group incomes and expenses by month, group students according to the class in which they are enrolled. Using the GROUP BY clause transforms data into a new result set in which the original records are placed in different groups using the criteria we provide. We can perform some additional actions or calculations on these groups, most of which are closely related to aggregate functions.

On the other hand, PARTITION BY is used to divide a result set into partitions and perform computation on each subset of partitioned data. It is used with window functions.

Here is an example of how GROUP BY works:

Suppose we have a table named “students” with columns “name”, “class”, and “marks”. We want to find out the total marks obtained by each student in each class.

SELECT name, class, SUM(marks) as total_marks
FROM students
GROUP BY name, class;

Here is an example of how PARTITION BY works:

Suppose we have a table named “students” with columns “name”, “class”, and “marks”. We want to find out the total marks obtained by each student in each class along with the average marks obtained by all students in that class.

SELECT name, class, marks,
AVG(marks) OVER (PARTITION BY class) as avg_marks
FROM students;

I hope this helps! Let me know if you have any other questions.

--

--