SQL HackerRank : Interviews

Shefali Bisht
Geek Culture
Published in
6 min readJan 6, 2022

Hard : Advanced SQL Joins

Problem Statement : Samantha interviews many candidates from different colleges using coding challenges and contests. Write a query to print the contest_id, hacker_id, name, and the sums of total_submissions, total_accepted_submissions, total_views, and total_unique_views for each contest sorted by contest_id. Exclude the contest from the result if all four sums are 0.

Note: A specific contest can be used to screen candidates at more than one college, but each college only holds 1 screening contest.

Link to HackerRank problem.

Input Format

The following tables hold interview data:

  • Contests: The contest_id is the id of the contest, hacker_id is the id of the hacker who created the contest, and name is the name of the hacker.
  • Colleges: The college_id is the id of the college, and contest_id is the id of the contest that Samantha used to screen the candidates.
  • Challenges: The challenge_id is the id of the challenge that belongs to one of the contests whose contest_id Samantha forgot, and college_id is the id of the college where the challenge was given to candidates.
  • View_Stats: The challenge_id is the id of the challenge, total_views is the number of times the challenge was viewed by candidates, and total_unique_views is the number of times the challenge was viewed by unique candidates.
  • Submission_Stats: The challenge_id is the id of the challenge, total_submissions is the number of submissions for the challenge, and total_accepted_submission is the number of submissions that achieved full scores.

Solution and its walkthrough :-

The overall workflow looks something like this :

Relationship Diagram

Sample dataset for Contests, Colleges and Challenges tables :

Sample Dataset 1

Sample dataset for View_Stats and Submission_Stats tables :

Sample Dataset 2

Step 1 : Lets take Contest_id: 66406. We want to get all Colleges where Contest_id: 66406 was held.

Only college_id : 11219 held Contest_id: 66406.

Part 1

Query:-

SELECT con.contest_id, con.hacker_id, con.name, col.college_id 
FROM Contests con
inner join Colleges col on con.contest_id = col.contest_id

Step 2 :

Now college_id : 11219 had two challenges with challenge_id 18765 and 47127

Part 2

Query:-

SELECT con.contest_id, con.hacker_id, con.name, ch.challenge_id 
FROM Contests con
inner join Colleges col on con.contest_id = col.contest_id
inner join Challenges ch on col.college_id = ch.college_id

Step 3 : Find Total views and total submissions by challenge_id

Part 3-a

In our example, challenge_id : 18765 and 47127

Part 3-b

Now we have to join this result with our result from step 2.

Part 3-c

Query:-

WITH Sub_stats_cte as
(
select challenge_id, SUM(total_Submissions) as total_submissions, SUM(total_accepted_Submissions) as total_accepted_submissions
from Submission_Stats group by challenge_id
), View_stats_cte as
(
select challenge_id, SUM(total_views) as total_views, SUM(total_unique_views) as total_unique_views
from View_Stats group by challenge_id
)
SELECT con.contest_id, con.hacker_id, con.name, ch.challenge_id, total_submissions, total_accepted_submissions
total_views, total_unique_views
FROM Contests con
inner join Colleges col on con.contest_id = col.contest_id
inner join Challenges ch on col.college_id = ch.college_id
left join Sub_stats_cte c1 ON ch.challenge_id = c1.challenge_id
left join View_stats_cte c2 ON ch.challenge_id = c2.challenge_id

Step 4 : Aggregate the results

Now we require the sums of total_submissions, total_accepted_submissions, total_views, and total_unique_views for each contest sorted by contest_id.

total_views = 115 + 41 = 155
total_unique_views = 23 + 33 = 56
total_submissions = NULL (0) + 111 = 111
total_accepted_submissions = NULL(0) + 39 = 39

Part 4

Query:-

WITH Sub_stats_cte as
(
select challenge_id, SUM(total_Submissions) as total_submissions, SUM(total_accepted_Submissions) as total_accepted_submissions
from Submission_Stats group by challenge_id
), View_stats_cte as
(
select challenge_id, SUM(total_views) as total_views, SUM(total_unique_views) as total_unique_views
from View_Stats group by challenge_id
)
SELECT con.contest_id, con.hacker_id, con.name, ISNULL(SUM(total_submissions),0) total_submissions,
ISNULL(SUM(total_accepted_submissions),0) total_accepted_submissions,
ISNULL(SUM(total_views),0) total_views, ISNULL(SUM(total_unique_views),0) total_unique_views
FROM Contests con
inner join Colleges col on con.contest_id = col.contest_id
inner join Challenges ch on col.college_id = ch.college_id
LEFT JOIN Sub_stats_cte c1 ON ch.challenge_id = c1.challenge_id
LEFT JOIN View_stats_cte c2 ON ch.challenge_id = c2.challenge_id
GROUP BY con.contest_id, con.hacker_id, con.name
HAVING SUM(total_submissions)>0
OR SUM(total_accepted_submissions)>0
OR SUM(total_views)>0
OR SUM(total_unique_views)>0
ORDER BY contest_id;

Note : We have to exclude the contest from the result if all four sums are 0. That is why we have used HAVING clause.

Below is the result of the above query on the given dataset.

I have prepared below script if anyone wants to try out this question along with the sample dataset. This will eliminate the hassle of creating a dummy dataset from scratch :D

with Contests as
(
select 66406 as contest_id, 17973 as hacker_id,'Rose' as name union all
select 66556 as contest_id, 79153 as hacker_id,'Angela' as name union all
select 94828 as contest_id, 80275 as hacker_id,'Frank' as name
), Colleges as
(
select 11219 as college_id, 66406 as contest_id union all
select 32473 as college_id, 66556 as contest_id union all
select 56685 as college_id, 94828 as contest_id
), Challenges as
(
select 18765 as challenge_id, 11219 as college_id union all
select 47127 as challenge_id, 11219 as college_id union all
select 60292 as challenge_id, 32473 as college_id union all
select 72974 as challenge_id, 56685 as college_id
), View_Stats as
(
select 47127 as challenge_id, 26 as total_views, 19 as total_unique_views union all
select 47127 as challenge_id, 15 as total_views, 14 as total_unique_views union all
select 18765 as challenge_id, 43 as total_views, 10 as total_unique_views union all
select 18765 as challenge_id, 72 as total_views, 13 as total_unique_views union all
select 75516 as challenge_id, 35 as total_views, 17 as total_unique_views union all
select 60292 as challenge_id, 11 as total_views, 10 as total_unique_views union all
select 72974 as challenge_id, 41 as total_views, 15 as total_unique_views union all
select 75516 as challenge_id, 75 as total_views, 11 as total_unique_views
), Submission_Stats as
(
select 75516 as challenge_id, 34 as total_Submissions, 12 as total_accepted_Submissions union all
select 47127 as challenge_id, 27 as total_Submissions, 10 as total_accepted_Submissions union all
select 47127 as challenge_id, 56 as total_Submissions, 18 as total_accepted_Submissions union all
select 75516 as challenge_id, 74 as total_Submissions, 12 as total_accepted_Submissions union all
select 75516 as challenge_id, 83 as total_Submissions, 8 as total_accepted_Submissions union all
select 72974 as challenge_id, 68 as total_Submissions, 24 as total_accepted_Submissions union all
select 72974 as challenge_id, 82 as total_Submissions, 14 as total_accepted_Submissions union all
select 47127 as challenge_id, 28 as total_Submissions, 11 as total_accepted_Submissions
), Sub_stats_cte as
(
select challenge_id, SUM(total_Submissions) as total_submissions, SUM(total_accepted_Submissions) as total_accepted_submissions
from Submission_Stats group by challenge_id
), View_stats_cte as
(
select challenge_id, SUM(total_views) as total_views, SUM(total_unique_views) as total_unique_views
from View_Stats group by challenge_id
)
SELECT con.contest_id, con.hacker_id, con.name, ISNULL(SUM(total_submissions),0) total_submissions,
ISNULL(SUM(total_accepted_submissions),0) total_accepted_submissions,
ISNULL(SUM(total_views),0) total_views, ISNULL(SUM(total_unique_views),0) total_unique_views
FROM Contests con
inner join Colleges col on con.contest_id = col.contest_id
inner join Challenges ch on col.college_id = ch.college_id
LEFT JOIN Sub_stats_cte T1 ON ch.challenge_id = T1.challenge_id
LEFT JOIN View_stats_cte T2 ON ch.challenge_id = T2.challenge_id
GROUP BY con.contest_id, con.hacker_id, con.name
HAVING SUM(total_submissions)>0
OR SUM(total_accepted_submissions)>0
OR SUM(total_views)>0
OR SUM(total_unique_views)>0
ORDER BY contest_id;

If you found this article helpful, share it with your friends and colleagues. If you have any other questions, you can find me on Linkedin .

--

--

Shefali Bisht
Geek Culture

Data Engineer who loves experimenting with different datasets and technologies to make your life easy and mine complex. https://www.shefalibisht.com/