Advanced SQL for analysts: window functions are not your enemies, P.1

Akhmet (Massomi) Timur
3 min readMay 19, 2022

--

This is the first part of SQL window functions article. Part 2 is here

It is enough to know the basic SQL to be an analyst but if you want to be a good one — for sure you should be able to use window functions. And if you are afraid of this scary monster — believe me after reading this post WF will become your best friend!

Window functions are the calculations for every row within the selected group.

Doesn’t make sense, huh? It will get clearer with the example.

We have a table of students scores with the following columns:

Students are quite good at Music

Let’s get the average score first:

SELECT AVG(score)
FROM students_score;

Nothing complicated. Now let’s divide average scores by faculties and years.

SELECT year, faculty,  AVG(score)
FROM students_score
GROUP BY 1,2

Still not a rocket science. Just a simple aggregate (grouping) functions. Now let’s imagine that we want to compare score of every student with the average score of his faculty. Probably your query will look something like this:

SELECT s.*, q.avg_score
FROM students_score s
LEFT JOIN (
SELECT faculty, AVG(score) AS avg_score
FROM students_score
GROUP BY 1
) q ON s.faculty = q.faculty

Great, it works! But… How long did it take you to dive into the query and understand what’s really happening? Keep in mind that in real life we almost always have many times more complicated cases. And then the window functions come to the rescue!

We need to calculate average score for every student’s row within the faculty.

SELECT s.*, 
AVG(score) OVER (PARTITION BY faculty) AS avg_score
FROM students_score s

Wow! We managed to replace the huge query with just one line of code and got absolutely the same result!!!

The sintax of the window functions is folllowing:

AGGREGATE_FUNC( value ) OVER (PARTITION BY grouping_value)

Visually it looks like this:

For every row we get avg score grouped by faculty

Also we can find min and max scores and show it on every row:

SELECT s.*, 
AVG(score) OVER (PARTITION BY faculty) AS avg_score,
MIN(score) OVER (PARTITION BY faculty) AS min_score,
MAX(score) OVER (PARTITION BY faculty) AS max_score
FROM students_score s
Resulting table

That’s all for today. I recommend to practice more and play around with this new functionality you’ve just learned and in part 2 we will dive more into window functions.

Feel free to connect with me on linkedin.

‘You don’t learn to walk by following rules. You learn by doing, and by falling over.’ –Richard Branson

--

--