SQL “OVER()” Clause: A Short 3-Minute Lecture

The Ultimate Guide to Getting the Basics Right.

Richard Warepam
ILLUMINATION
3 min read · May 12, 2024

Are you gearing up for a data science interview? Or perhaps you’re just keen to understand the basics of the OVER() clause in SQL?

Either way, you’re in the right place. I’m here to break down everything you need to know about it in a simple and straightforward manner. So, let’s dive in.

Window functions are a crucial topic in SQL and a common subject in data science interviews. It is therefore essential to grasp their core component: the OVER() clause.

So, what exactly is the OVER() clause, and when do we use it?

Essentially, OVER() defines the window, that is, the set of rows a window function can see when it computes a value for the current row. Its sub-clause PARTITION BY names the column (or columns) used to divide the table's rows into partitions, and each partition is then treated as its own window.

The key point here is that window functions are then applied within these windows. They fall into three groups: aggregate functions (such as SUM and AVG), ranking functions (such as RANK and ROW_NUMBER), and value functions (such as LAG and FIRST_VALUE).
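To make the three kinds of window functions concrete, here is a minimal sketch. It assumes a made-up employees table and runs the query on SQLite (3.25 or newer, which is when window functions arrived) through Python's built-in sqlite3 module. The table, column names, and numbers are purely illustrative.

```python
import sqlite3

# Hypothetical sample data: (name, dept, salary).
EMPLOYEES = [
    ("Ana", "Sales", 300),
    ("Ben", "Sales", 500),
    ("Cy", "Tech", 400),
    ("Di", "Tech", 700),
    ("Eli", "Tech", 600),
]

# One function of each kind, all windowed by department.
QUERY = """
SELECT name,
       dept,
       salary,
       AVG(salary)       OVER (PARTITION BY dept)                      AS dept_avg,     -- aggregate
       RANK()            OVER (PARTITION BY dept ORDER BY salary DESC) AS salary_rank,  -- ranking
       FIRST_VALUE(name) OVER (PARTITION BY dept ORDER BY salary DESC) AS top_earner    -- value
FROM employees
ORDER BY dept, salary_rank
"""


def run_window_examples():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", EMPLOYEES)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_window_examples():
        print(row)
```

Every employee still gets their own row, and each row carries the department average, the employee's rank within the department, and the name of the department's top earner.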

Now, you might be wondering if there are any other SQL clauses that perform a similar function. Does anything come to mind?

Yes, that’s right — the “GROUP BY” clause. But what’s the difference?

Well, the GROUP BY clause collapses the rows of each group into a single summary row, so the detail in the individual rows is lost. In contrast, with the OVER clause and PARTITION BY, you keep every row and get the aggregated value, or the result of any other window function, right alongside it.

Therefore, the main advantage of the OVER clause is that it aggregates over partitions without losing the row-wise data. Now, you know when to use which of the two, right?
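If you'd like to see the difference side by side, here is a small sketch. It assumes a hypothetical sales table with three rows, again run on SQLite (3.25+) via Python's sqlite3 module. Nothing here comes from a real dataset.

```python
import sqlite3

# Hypothetical sample data: (region, quarter, amount).
SALES = [
    ("North", "Q1", 100),
    ("North", "Q2", 150),
    ("South", "Q1", 200),
]


def compare_group_by_and_over():
    """Return (grouped_rows, windowed_rows) for the same SUM per region."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SALES)

    # GROUP BY: one output row per region, the quarter-level detail is gone.
    grouped = conn.execute(
        "SELECT region, SUM(amount) AS region_total "
        "FROM sales GROUP BY region ORDER BY region"
    ).fetchall()

    # OVER (PARTITION BY ...): every input row survives and gains the total.
    windowed = conn.execute(
        "SELECT region, quarter, amount, "
        "       SUM(amount) OVER (PARTITION BY region) AS region_total "
        "FROM sales ORDER BY region, quarter"
    ).fetchall()

    conn.close()
    return grouped, windowed


if __name__ == "__main__":
    grouped, windowed = compare_group_by_and_over()
    print("GROUP BY:", grouped)
    print("OVER    :", windowed)
```

With this sample data, GROUP BY returns two rows (one per region), while the OVER version returns all three original rows, each with its region's total attached.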

Components of the OVER clause:

SELECT col1, col2,
       window_function(col3) OVER ([PARTITION BY col_name]
                                   [ORDER BY col_name]
                                   [ROWS or RANGE clause])
FROM table_name

As the syntax above shows, the main components are:

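  1. Window Functions: These are applied to the windows defined by the OVER clause. A window is the set of rows the function works on for the current row, for example all rows that share the same partition value.
  2. PARTITION BY Sub-Clause: This is the primary sub-clause that divides the rows into partitions. The window function is computed for each row using only the rows in that row's partition. It is optional: without it, the whole result set is treated as a single partition.
  3. ORDER BY: This is used to sort the rows within each partition. By default, it arranges them in ascending order. Ranking functions depend on this order, and with an aggregate function it produces a running value, because the default frame then ends at the current row.
  4. ROWS or RANGE: These limit the window to a frame that runs from a starting point to an endpoint relative to the current row. They are normally used together with ORDER BY, and some databases require it. The two clauses are similar, but the key difference is how they handle duplicates: ROWS counts physical rows, so rows with the same ORDER BY value are handled one by one, while RANGE works on values, so rows that tie on the ORDER BY column always enter the frame together.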
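The ROWS versus RANGE difference is easiest to see with tied values. Here is a minimal sketch, assuming a hypothetical payments table in which two rows share the same day. It runs on SQLite (3.25+) through Python's sqlite3 module, and the data is invented for illustration.

```python
import sqlite3

# Hypothetical sample data: (day, amount). Day 2 appears twice on purpose.
PAYMENTS = [(1, 10), (2, 20), (2, 20), (3, 40)]

# Same running SUM twice: once framed by physical rows, once by value range.
QUERY = """
SELECT day,
       amount,
       SUM(amount) OVER (ORDER BY day
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)  AS rows_sum,
       SUM(amount) OVER (ORDER BY day
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum
FROM payments
ORDER BY day, rows_sum
"""


def running_totals():
    """Return (day, amount, rows_sum, range_sum) for each payment."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (day INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", PAYMENTS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_totals():
        print(row)
```

For the two day-2 rows, rows_sum climbs one row at a time (30, then 50), while range_sum is 50 on both, because RANGE pulls in every row that ties with the current row on day.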

Wrapping Up:

I trust you’ve grasped the fundamentals of the OVER() clause. It’s used with a variety of window functions. If you’re keen to delve deeper into each function, let me know in the comments. Based on your responses, I’ll write an article to explain them.

In the meantime, follow me, peruse my articles, invest in my eBooks, continue your data science learning journey, and, most importantly, enjoy life.

Access to my eBooks: https://codewarepam.gumroad.com/
Subscribe for more data science case studies:
https://yourdataguide.substack.com/
