How to create windows and use window functions for pretty advanced SQL analysis.

Dominic Imbuga
DevCNairobi
Published in
3 min readAug 11, 2021

One of the most powerful concepts in SQL data analysis is the window function, it took a little practice for me to really nail it and it become extremely useful on my Machine learning journey as you will see here in this article.

Learning to use this tool in different situations made my data analysis effective. The Window function allowed me to compare one row to another without doing joins! This meant I could learn to do simple things like create a running total as well as tricky things like to determine whether one row was greater than previous row and classify it based on my findings.

The most practical example of a window function is a running total, for example, let us say you want to calculate a running total of how much standard paper Kiwanda sold today.

You can see that this query creates an aggregation running total, without using GROUP BY. Let us break down the syntax and see how it works.

The first part of this aggregation, looks a lot like any other aggregation, adding OVER designates it as a Window function. You could read the above aggregation as; take the sum of standard quantity across all rows leading up to a given row , in-order by occurred at.

Let us say that instead we want to start the running total OVER at the beginning of each month. To narrow the window from the entire dataset to individual groups within the dataset we use the PARTITION BY function.

Now, this query groups and orders the query at the month in which the transactions occurred.Within each month, it is ordered by occurred at and the running total sums across that current row and all previous rows of standard quantity. That is what happens when you group using PARTITION BY.

In case you are still stumped by ORDER BY, it simply orders the designated columns the same way the ORDER BY clause would except that it treats every partition as separate. It also creates the running total, without ORDER BY each value will simply be a sum of all standard quantity values in its respective month.

To get a clear picture, here is what happens when we run this query with ORDER BY .

The ORDER and PARTITION defines what is referred to as the Window ; the ordered subset of data where all this calculation are made !

You have seen why one of the most powerful concepts in SQL data analysis is the window function, it will take a little practice for you to really nail it and see it become extremely useful. Making you a world class data analyst in the process because most SQL education end before the window function.

Congratulations for reading to the end. If you find this useful , clap ! so that other folks can see it easier.

--

--