SQL Rank Functions OverSimplified
Is SQL needed for Business Professionals?
SQL is one of the easiest computer languages to learn. These days more and more companies are encouraging their employees in non-IT areas (like sales, advertising, and finance) to learn and use SQL. When business is empowered to answer data-related questions themselves, the decision makers can then transform these insights into actionable items and increase productivity. If you are interested to know more about SQL and you are familiar with the basics, then this article will help you quickly understand some awesome SQL Functions that will come in very handy.
Introduction on Window Functions
Why it called Window? Window here describes the set of rows over which the function operates.
What does it do? It performs a bunch of calculations on a set of table rows that somehow relate to the current row.
What happens behind the scene? It works like an aggregate function but what is pertinent to note here is, unlike aggregate functions, the rows do not just get grouped into a single output row, and instead, those rows still retain their separate identities.
Although Window functions broadly cover all these functions, we will just focus on Ranking today.
- Aggregate Functions: MAX(), MIN(), AVG(), SUM(), COUNT(), etc.
- Ranking Functions: ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), PERCENT_RANK(), CUME_DIST() etc.
- Analytic Functions:LEAD(), LAG(), FIRST_VALUE(), LAST_VALUE(), etc.
- Rank with Partition
- Rank without Partition
- Difference between Rank & Dense Rank
- Row Number
1. Use Case for Rank with Partition (When you need to see a subset of data together)
Imagine you have twins -Isabelle and Olivia. They study in the same class. To give them the parent talk regarding their career path, you will have to see their scores.
Business Logic→ Circling back to the data problem, you will need to see Isabella’s subjects and scores together, followed by Olivia’s subjects and scores.
SQL Logic→ This business logic should lead you to think about how you can compartmentalize based on the Student's Name.
Business Logic → If you want to first see the subjects where the child scored the most followed by the subject where the child scored the second then you will have to start thinking about ranks.
SQL Logic→ Rank 1 allocated to the subject where the child scored the highest. Hint is to arrange in descending order.
SELECT Studentname,Subject,Marks,RANK() OVER(PARTITION BY Studentname ORDER BY Marks DESC) RankFROM ExamResultORDER BY Studentname,Rank;
- To separate out the children, we use PARTITION BY Studentname clause so we can perform calculations on the marks scored by each student group. By each student group, I mean Isabella and Olivia.
- As a parent, you want to see their best performance first so lets order by Rank in descending
- We will also use Order By clause to sort results on Student name so you see Isabella first.
There you go! You can now give them a talk about literature.
2. Rank without Partition (Subset not needed, Rank is needed)
Growing up, I vividly remember my friend’s mother asking this question to my friend who did very well for herself.
“Hmmm… long pause … so how many other children, got the same marks as you did?How did the others perform when compared to you?“
In other words, we want to rank student marks across all subjects. See below.
- Results tell you that Isabella scored the highest in her class in English and Olivia is also equally good in English. Writers in the making!
2. Some parents might even look at Rank 3 to wish Rank 3 (in Science by Lily ) to be given to their child Isabella or Olivia.
We have not used partition by subject or student, because we just wanted to have a view of all subjects together. Instead, we have used SQL Rank function with OVER clause on Marks clause ( in descending order) to get ranks for respective rows.
Rank and Dense Rank Difference (Dense_Rank() gives a unique rank)
RANK skips positions after equal rankings. The number of positions skipped depends on how many rows had an identical ranking. For example, Isabella scored the same marks in both Math & science so both subjects are ranked as #1. With
RANK, the next position is #3; with
DENSE_RANK, the next position is #2.
Dense_Rank() → This gives you a unique rank number within the partition as per the specified column value.
This function gives a unique sequential number for each row (1, 2, 3, etc)It gives the rank one for the first row and then increments the value by one for each row.
The caveat here is that different ranks for the row have similar values as well.
SELECT Studentname,Subject,Marks,ROW_NUMBER() OVER(ORDER BY Marks) RowNumberFROM ExamResult;
Ranking functions are among the most common tools used by data analysts. To feel confident with ranking in SQL, please feel empowered & start practicing. You can check out SQL Fiddle or SQL Zoo. These are free platforms that present SQL in a way that is very friendly to beginners. It’s super intuitive & will help you get hands-on with SQL in a comfortable environment.