Window function in SQL

Shawn
4 min readOct 25, 2022

--

https://www.toptal.com/sql/intro-to-sql-windows-functions

In day-to-day work, it is common to encounter the need to rank within each group, this is when we need to use window function

Window functions, also called OLAP functions (Online Analytical Processing), can perform real-time analysis and processing on database data.

over (partition by <column name used for grouping>
order by <column name for sorting>)

So it was just like first group by a column, then order by another column.

Below case it’s group by column two班级 (Orange line), and in each of group order by score成绩.

select *,
rank() over (partition by 班级
order by 成绩 desc) as ranking
from 班级表
https://zhuanlan.zhihu.com/p/92654574

It can be used for:

1) Dedicated window functions: rank, dense_rank, row_number and other

2) Aggregation functions: sum, avg, count, max, min, etc.

Can also be used as a reference of overall average, if we put nothing in over() function.

Can partition data by 1 or more columns

Why we use window function if there’s already existed GROUP BY function?

This is because group by changes the number of rows in the table after grouping, and a row has only one category. The partition by and rank functions do not reduce the number of rows in the original table. For example, the following counts the number of people in each class.

I believe you can recognize which is using window function
select *,
rank() over (order by 成绩 desc) as ranking,
dense_rank() over (order by 成绩 desc) as dese_rank,
row_number() over (order by 成绩 desc) as row_num
from 班级表
different between these three functions.

Sliding window keywords

  • UNBOUNDED: start or the end of partition
  • PRECEDING: data before
  • FOLLOWING: data after
  • UNBOUNDED PRECEDING means that the window starts from the first column of the partition; offset PRECEDING means that the window starts from the number of columns equal to the offset value before the current column. UNBOUNDED PRECEDING is the default value.
  • CURRENT ROW indicates that the window starts or ends in the current row.
  • UNBOUNDED FOLLOWING means that the window ends at the last column of the partition; offset FOLLOWING means that the window ends at the number of columns equal to the offset value after the current column.

Sales and sales_1 result are different, one is the current row and another is from start to current row.

Case study 1:

You will be creating a data set of games played by Legia Warszawa (Warsaw League), the top ranked team in Poland, and comparing their individual game performance to the overall average for that season.

In this case, use CASE WHEN to categorize they are in home or away, and use window function to group by season.

--

--

Shawn

Self taught — Data Analyst | Business Intelligence Specialist | Business Analyst | Data scientist