SQL Data Interview Series: Differences between Rank, Dense Rank and Row Number Window Functions

Maria-Goretti Anike
4 min readDec 5, 2023

--

Hello everyone. Welcome to Day 3 of the SQL Data Analyst Interview Questions series. You can read up yesterday’s topic — Views in SQL — here. Today, after watching Techtfq’s “Top 25 SQL Interview Questions”, I decided to answer Question 8, which states: ‘What are the differences between RANK, DENSE_RANK and ROW_NUMBER window functions?’ Let’s get to it.

We’ll start with the ROW_NUMBER function.
This function was introduced in SQL Server 2005. It returns the sequential number of a row starting at 1. The ORDER BY clause is required for this, while the PARTITION BY clause is optional (The row number is generated based on the ordering imposed by the ORDER BY clause). When the data is partitioned, row number is reset to 1 when the partition changes.
Syntax: ROW_NUMBER() OVER (ORDER BY Col 1,…). This function is great in fishing out and deleting all duplicate rows except 1.

RANK and DENSE_RANK functions:
• Introduced in SQL Server 2005
• Return a rank starting at 1 based on the ordering of rows imposed by the ORDER BY clause
• ORDER BY clause is required
• PARTITION BY clause is optional
• When the data is partitioned, rank is reset to 1 when the partition changes.
Syntax: RANK() OVER (ORDER BY Col1,…)
DENSE_RANK() OVER (ORDER BY Col1,…)
Use Case: Both functions can be used, for example, to find Nth highest salary. However, which function to use depends on what one would want to do when there is a tie.

Similarities between these three functions:
• Return an increasing integer value starting at 1 based on the ordering of rows imposed by the ORDER BY clause (if there are no ties)
• ORDER BY clause is required
• PARTITION BY is optional
• When the data is partitioned, the integer value is reset to 1 when the partition changes

Differences between these three functions:
• The ROW_NUMBER function returns an increasing unique number for each row starting at 1, even if there are duplicates.
• The RANK function returns an increasing unique number for each row starting at 1. When there are duplicates, same rank is assigned to all the duplicate rows, but the next row after the duplicate rows will have the rank it would have been assigned if there had been no duplicates. So RANK function skips rankings if there are duplicates.
• The DENSE_RANK function returns an increasing unique number for each row starting at 1. When there are duplicates, same rank is assigned to all the duplicate rows, but the DENSE_RANK function will not skip any ranks. This means the next row after the duplicate rows will have the next rank in the sequence.

Simply put, imagine we have a table with 5 rows in total, and the first two rows are duplicates. For each of these functions, we’d get:
ROW_NUMBER: 1, 2, 3, 4, 5
RANK: 1, 1, 3, 4, 5
DENSE_RANK: 1, 1, 2, 3, 4

Let’s take a better look at this using our SQL for Marketing dataset.

From this, we can see that the following query:

SELECT ID, Education, Income,
ROW_NUMBER () OVER (ORDER BY INCOME DESC) RowNumber,
RANK () OVER (ORDER BY INCOME DESC) Rank,
DENSE_RANK () OVER (ORDER BY INCOME DESC) DenseRank
FROM SQLDevotee.dbo.Marketing_Campaign

has given us three different columns with rankings. Now, take a look at
№s 14 & 15, with IDs 4248 and 7451 respectively. Here, we see that these customers earn the same income — 98777. In the order of rankings, ROW_NUMBER does not acknowledge this similarity, thereby giving different rankings, 14 & 15, to these customers based on this income, as it does not recognize duplicates. Therefore, ROW_NUMBER: 13, 14, 15, 16, 17.

On the other hand, RANK recognizes this and assigns the same ranking for this similar income, 14. It gives the same ranking to these duplicate rows, then skips the ranking ‘15’, as this 15th person has been ranked 14th already. Therefore, RANK: 13, 14, 14, 16, 17.

DENSE_RANK is similar to RANK in that it acknowledges these duplicates, but it differs from RANK as it does not skip any ranking. As RANK gives 14 in two places and skips to 16, DENSE_RANK gives 14 in two places and still assigns 15 to the next customer. Therefore, DENSE_RANK: 13, 14, 14, 15, 16.

Another example. Let’s create a new table ‘CustomerIncome’.

Now let’s rank these customers by their income.

In this, ROW_NUMBER does not recognize the duplicate rows and keeps assigning ranks sequentially. RANK recognizes and assigns ‘2’ to the two duplicates, skips ‘3’ and moves on to ‘4’. DENSE_RANK recognizes and assigns ‘2’ to the two duplicates, but does not skip ‘3’.

I hope you grasp the concept of these functions better now. Do leave lots of claps and encouraging comments. Tomorrow, we’ll be discussing Subqueries in SQL. 🤗

GIF from GIPHY

--

--

Maria-Goretti Anike

Hey yo there 😄! I'm Maria, your favourite Content Writer, Data Explorer & ardent SQL devotee. I write all about B2B, B2C, SaaS and Marketing/Product Analytics.