15 days of learning SQL : HackerRank

Shefali Bisht
Geek Culture
Published in
5 min readJan 23, 2022

Hard : Advanced SQL Joins

Problem Statement: Julia conducted a days of learning SQL contest. The start date of the contest was March 01, 2016 and the end date was March 15, 2016.
Write a query to print total number of unique hackers who made at least submission each day (starting on the first day of the contest), and find the hacker_id and name of the hacker who made maximum number of submissions each day. If more than one such hacker has a maximum number of submissions, print the lowest hacker_id. The query should print this information for each day of the contest, sorted by the date.

Input Format: The following tables hold contest data:

  • Hackers: The hacker_id is the id of the hacker, and name is the name of the hacker.
  • Submissions: The submission_date is the date of the submission, submission_id is the id of the submission, hacker_id is the id of the hacker who made the submission, and score is the score of the submission.

Here is the Link to the original HackerRank question.

Lets break the problem into two parts :-

Part 1: Find the hacker_id of the hacker who made maximum number of submissions each day.

Part 2 : Get a list of unique hackers who made at least submission each day (starting on the first day of the contest)

For Part 1 : First we will query to get unique combination of submission_date and hacker_id, thus giving us the count of submissions made by each hacker in one particular day.

Query:-

select submission_date, hacker_id, count(1) as SubCount
from submissions
group by submission_date, hacker_id

However we want to get Hacker_id who made maximum number of submissions each day. So we will add a column named ‘Rn’ using RANK() window function.

OVER(order by SubCount desc, hacker_id) : Hacker_id with maximum number of submissions and will give lowest hacker_id in case more than one hacker has a maximum number of submissions.

with MaxSubEachDay as (
select submission_date,
hacker_id,
RANK() OVER(partition by submission_date order by SubCount desc, hacker_id) as Rn
FROM
(select submission_date, hacker_id, count(1) as SubCount
from submissions
group by submission_date, hacker_id
) subQuery
)

For Part 2: Get a list of unique hackers who made at least submission each day (starting on the first day of the contest)

This part is quite tricky and complex. We need to drill through it step by step.

For example lets take below input for our submissions. Here only Rohit has made..

Step 1 : Create a cte to add a new column which will keep track of day sequence. i.e. All submissions of ‘2016–03–01’ will have dayRn = 1, submissions of ‘2016–03–02’ will have dayRn = 2, submissions of ‘2016–03–03’ will have dayRn = 3 and so on..

DayWiseRank as (
select submission_date,
hacker_id,
DENSE_RANK() OVER(order by submission_date) as dayRn
from submissions
)

Step 2 : Now we will iterate row by row and will check if any hacker_id has a submission in previous days.
For example if a record’s submission_date is ‘2016–03–03’, we will check if the hacker has at least one submission in all the dates earlier than ‘2016–03–03’. i.e. ‘2016–03–02’ and ‘2016–03–01’

HackerCntTillDate as (
select outtr.submission_date,
outtr.hacker_id,
case when outtr.submission_date='2016-03-01' then 1
else 1+(select count(distinct a.submission_date) from submissions a where a.hacker_id = outtr.hacker_id and a.submission_date<outtr.submission_date)
end as PrevCnt,
outtr.dayRn
from DayWiseRank outtr
)

When submission_date=’2016–03–01' or dayRn = 1, we will set default value of PrevCnt=1 because all the hackers on the first day who made at least one submission should be counted.

Step 3: Now we will count unique hacker_id for each day.
Note: We have used a where condition ‘where PrevCnt = dayRn’.
The reason is, for a hacker to be considered as ‘hacker who made at least one submission each day’, the count of distinct ‘submission_date’ till date should match the current submission date.
In our example, for date ’2016–03–03', ‘Mira’ has made a submission. Although she has also made submission in ’2016–03–02' but she has not made any in ’2016–03–01'. Hence, she does not qualify as ‘hacker who made at least one submission each day’.

HackerSubEachDay as (
select submission_date,
count(distinct hacker_id) HackerCnt
from HackerCntTillDate
where PrevCnt = dayRn
group by submission_date
)

Time to combine Part 1 and Part 2 (Phewwww!!)

We will put all the cte(s) together and join HackerSubEachDay, MaxSubEachDay with Hackers table to get corresponding names.

Remember: We have to apply ‘where MaxSubEachDay.Rn=1’ as we want only Rank=1 hackers i.e. who made maximum number of submissions on a particular day.

with MaxSubEachDay as (
select submission_date,
hacker_id,
RANK() OVER(partition by submission_date order by SubCount desc, hacker_id) as Rn
FROM
(select submission_date, hacker_id, count(1) as SubCount
from submissions
group by submission_date, hacker_id
) subQuery
), DayWiseRank as (
select submission_date,
hacker_id,
DENSE_RANK() OVER(order by submission_date) as dayRn
from submissions
), HackerCntTillDate as (
select outtr.submission_date,
outtr.hacker_id,
case when outtr.submission_date='2016-03-01' then 1
else 1+(select count(distinct a.submission_date) from submissions a where a.hacker_id = outtr.hacker_id and a.submission_date<outtr.submission_date)
end as PrevCnt,
outtr.dayRn
from DayWiseRank outtr
), HackerSubEachDay as (
select submission_date,
count(distinct hacker_id) HackerCnt
from HackerCntTillDate
where PrevCnt = dayRn
group by submission_date
)
select HackerSubEachDay.submission_date,
HackerSubEachDay.HackerCnt,
MaxSubEachDay.hacker_id,
Hackers.name
from HackerSubEachDay
inner join MaxSubEachDay
on HackerSubEachDay.submission_date = MaxSubEachDay.submission_date
inner join Hackers
on Hackers.hacker_id = MaxSubEachDay.hacker_id
where MaxSubEachDay.Rn=1

Finally the moment we all have been waiting for…..

If you found this article helpful, share it with your friends and colleagues. If you have any other questions, you can find me on Linkedin .

--

--

Shefali Bisht
Geek Culture

Data Engineer who loves experimenting with different datasets and technologies to make your life easy and mine complex. https://www.shefalibisht.com/