Mastering Top 3 SQL Window Functions: SQL 101

Sheela Padhy
6 min readSep 18, 2023

--

Window functions are crucial in SQL for data analysis and reporting. They enable efficient ranking, grouping, and analytical calculations within result sets and are MUST to be known in interviews and real world.

Top 3 Functions

  • RANK assigns ranks to rows based on specific criteria.
  • DENSE_RANK provides consecutive ranks for tied values
  • ROW_NUMBER generates unique identifiers for each row

These functions are essential for tasks such as finding top performers, identifying duplicates, and creating meaningful reports.

Music Analytics — Ranking Top Musicians by Style and Rating

“Imagine you’re organizing a music competition, and your job is to assign rankings to a group of talented musicians based on their performance.

Lets explore the captivating world of RANK, DENSE_RANK, and ROW_NUMBER in SQL. Just as each musician in the competition has a unique sound, each of these window functions has its own way of determining how data should be ranked.

--Below script works on MySQL.

create database music_competition;
use music_competition;

CREATE TABLE musician_ratings (
musician_id INT AUTO_INCREMENT PRIMARY KEY,
musician_name VARCHAR(255) NOT NULL,
music_style VARCHAR(100) NOT NULL,
rating DECIMAL(3, 1) NOT NULL
);

INSERT INTO musician_ratings (musician_name, music_style, rating)
VALUES
('Musician 1', 'Rock', 8.5),
('Musician 2', 'Pop', 7.8),
('Musician 3', 'Rock', 9.0),
('Musician 4', 'Jazz', 8.2),
('Musician 5', 'Classical', 7.1),
('Musician 6', 'Pop', 8.9),
('Musician 7', 'Hip-Hop', 7.5),
('Musician 8', 'Rock', 8.7),
('Musician 9', 'Blues', 7.3),
('Musician 10', 'Jazz', 9.5),
('Musician 11', 'Country', 6.8),
('Musician 12', 'Electronic', 8.4),
('Musician 13', 'R&B', 7.2),
('Musician 14', 'Blues', 8.1),
('Musician 15', 'Folk', 7.6),
('Musician 16', 'Classical', 9.2),
('Musician 17', 'Pop', 7.9),
('Musician 18', 'Jazz', 9.7),
('Musician 19', 'Rock', 8.8),
('Musician 20', 'Hip-Hop', 7.4),
('Musician 21', 'R&B', 8.3),
('Musician 22', 'Country', 6.6),
('Musician 23', 'Classical', 8.0),
('Musician 24', 'Pop', 7.7),
('Musician 25', 'Blues', 8.1),
('Musician 26', 'Rock', 9.1),
('Musician 27', 'Jazz', 8.4),
('Musician 28', 'Electronic', 7.9),
('Musician 29', 'Folk', 7.0),
('Musician 30', 'Classical', 8.9),
('Musician 31', 'Pop', 7.2),
('Musician 32', 'Jazz', 9.3),
('Musician 33', 'Hip-Hop', 7.6),
('Musician 34', 'Rock', 8.6),
('Musician 35', 'Blues', 7.5),
('Musician 36', 'R&B', 8.0),
('Musician 37', 'Country', 6.7),
('Musician 38', 'Electronic', 8.2),
('Musician 39', 'Pop', 7.4),
('Musician 40', 'Classical', 8.8),
('Musician 41', 'Jazz', 9.6),
('Musician 42', 'Rock', 8.9),
('Musician 43', 'Folk', 6.9),
('Musician 44', 'Electronic', 8.5),
('Musician 45', 'Jazz', 9.4),
('Musician 46', 'Blues', 7.7),
('Musician 47', 'Country', 6.5),
('Musician 48', 'R&B', 8.1),
('Musician 49', 'Pop', 7.3),
('Musician 50', 'Rock', 8.4);

select * from musician_ratings ;
Musician ratings sample Data
Musician ratings sample data

RANK()

The RANK() function in SQL assigns a rank to each row within a result set based on a specified column’s values, with equal values receiving the same rank, and it may skip rank values for ties, resulting in non-consecutive ranks.


--with partition by
SELECT
musician_id,
musician_name,
music_style,
rating,
RANK() OVER (PARTITION BY music_style ORDER BY rating DESC) AS ranking
FROM
musician_ratings;

Rank() with partition by

The PARTITION BY clause divides the result set into partitions or groups based on the “music_style” column. Within each partition, the rows are ordered by the “rating” column in descending order (highest rating first). The RANK() function then assigns a rank to each musician within their music style group based on their rating.

In the above example,rank function assigned rank 1 to first 2 rows as the rating is same for music style ‘Blues’.The next rank assigned is 3 because the rank function skips the consecutive number(i.e. rank 2)

--without partition by

SELECT
musician_id,
musician_name,
music_style,
rating,
RANK() OVER (ORDER BY rating DESC) AS ranking
FROM
musician_ratings;
Rank() without partition by

When you use RANK() without the PARTITION BY clause, it assigns a unique rank to each row in the entire result set based on the specified ORDER BY criteria. Rows with the same values in the ORDER BY columns will receive the same rank, and the next rank will be skipped

In the above example,rank function assigned rank 9 to 3 rows as the rating is same for them.The next rank assigned is 12 because the rank function skips the consecutive number(i.e. rank 10 and 11)

DENSE_RANK()

The DENSE_RANK() function in SQL assigns a rank to each row within a result set based on a specified column’s values, with equal values receiving the same rank, without skipping rank values for ties, resulting in consecutive ranks.

--with partition by
SELECT
musician_id,
musician_name,
music_style,
rating,
DENSE_RANK() OVER (PARTITION BY music_style ORDER BY rating DESC) AS dense_rank
FROM
musician_ratings;
Dense Rank() with partition by

The PARTITION BY clause divides the result set into partitions or groups based on the “music_style” column. Within each partition, the rows are ordered by the “rating” column in descending order (highest rating first). The DENSE_RANK() function then assigns a rank to each musician within their music style group based on their rating.

In the above example,dense_rank function assigned rank 1 to first 2 rows as the rating is same for music style ‘Blues’.The next rank assigned is 2 because the rank function does not skips the consecutive number(i.e. rank 2)

--without partition by
SELECT
musician_id,
musician_name,
music_style,
rating,
DENSE_RANK() OVER (ORDER BY rating DESC) AS dense_rank
FROM
musician_ratings;
Dense Rank() without partition by

When you use DENSE_RANK() without the PARTITION BY clause, it assigns a unique rank to each row in the entire result set based on the specified ORDER BY criteria. Rows with the same values in the ORDER BY columns will receive the same rank, and the next rank will NOT be skipped

In the above example,dense_rank function assigned rank 9 to 3 rows as the rating is same for them.The next rank assigned is 10 because the dense_rank function does not skip the consecutive number.

ROW_NUMBER()

The ROW_NUMBER() function in SQL assigns a unique sequential number to each row within a result set, regardless of the values in any specific column, allowing for distinct identification of each row.

--with partition by
SELECT
musician_id,
musician_name,
music_style,
rating,
ROW_NUMBER() OVER (PARTITION BY music_style ORDER BY rating DESC) AS row_number
FROM
musician_ratings;

Row() with partition by

The row_number function assigns unique row number to each row partitioned by music_style.

--without partition by
SELECT
musician_id,
musician_name,
music_style,
rating,
ROW_NUMBER() OVER (music_style ORDER BY rating DESC) AS row_number
FROM
musician_ratings;
Row() without partition by

The row_number function assigns unique row number to each row on the entire result set.

RANK() vs DENSE_RANK() vs ROW_NUMBER()

  • Rank: When you want to assign unique ranks but allow for ties to skip ranks
  • Dense rank : When you want to assign ranks to rows with equal values without gaps in rank numbers.
  • Row number : When you need unique identifiers for each row and don’t want to skip numbers, regardless of ties.
SELECT
musician_id,
musician_name,
music_style,
rating,
RANK() OVER (ORDER BY rating DESC) AS ranking,
DENSE_RANK() OVER (ORDER BY rating DESC) AS den_rank,
ROW_NUMBER() OVER (ORDER BY rating DESC) AS row_num
FROM
musician_ratings;
RANK() vs DENSE_RANK() vs ROW_NUMBER()

The RANK(), DENSE_RANK(), and ROW_NUMBER() functions are particularly useful for ranking data, identifying outliers, and calculating moving averages and other technical indicators.

--

--

Sheela Padhy

Data Science Enthusiast |Transforming Data into Insights | Actively Exploring Opportunities