Category: Hard(Advanced SQL Joins)
Problem statement:
Samantha interviews many candidates from different colleges using coding challenges and contests. Write a query to print the contest_id, hacker_id, name, and the sums of total_submissions, total_accepted_submissions, total_views, and total_unique_views for each contest sorted by contest_id. Exclude the contest from the result if all four sums are 0.
Note: A specific contest can be used to screen candidates at more than one college, but each college only holds screening contest.
Let’s create the tables at first.
CREATE TABLE contests (
`contest_id` integer,
`hacker_id` integer,
`name` varchar(20)
);
CREATE TABLE colleges (
college_id integer,
contest_id integer
);
CREATE TABLE challenges (
challenge_id integer,
college_id integer
);
CREATE TABLE view_stats (
challenge_id integer,
total_views integer,
total_unique_views integer
);
CREATE TABLE submission_stats (
challenge_id integer,
total_submissions integer,
total_accepted_submissions integer
);
INSERT into contests values
(66406,17973,'Rose'),
(66556,79153,'Angela'),
(94828,80275,'Frank');
INSERT into colleges values
(11219,66406),
(32473,66556),
(56685,94828);
INSERT into challenges values
(18765,11219),
(47127,11219),
(60292,32473),
(72974,56685);
INSERT into view_stats values
(47127, 26, 19),
(47127, 15, 14),
(18765, 43, 10),
(18765, 72, 13),
(75516, 35, 17),
(60292, 11, 10),
(72974, 41, 15),
(75516, 75, 11);
INSERT into submission_stats values
(75516, 34 , 12),
(47127, 27 , 10),
(47127, 56 , 18),
(75516, 74 , 12),
(75516, 83 , 8),
(72974, 68 , 24),
(72974, 82 , 14),
(47127, 28 , 11);
Output will be for each contest:
contest_id,
hacker_id,
name,
sum(total_submissions),
sum(total_accepted_submissions),
sum(total_views),
sum(total_unique_views)
We can notice that contests table has no direct connection with view_stats and submission_stats table. But contests table has connection with colleges table(common field: contest_id). Colleges table has connection with challenges table(common field: college_id). Finally challenges table has connection with view_stats and submission_stats table (common field: challenge_id)
So first we have to build a connection between contests table to challenges table.
select
con.contest_id,
con.hacker_id,
con.name,
cha.challenge_id
from
contests con
join colleges col on con.contest_id = col.contest_id
join challenges cha on col.college_id = cha.college_id;
result-1:
So in contest 66406, challenge 18765, 47127 were displayed;
in contest 66556, challenge 60292 was displayed; in contest 94828, challenge 72974 was displayed.
No we have to find out these challenges’ total_submissions, total_accepted_submissions from submission_stats and total_views,
total_unique_views from view_stats.
We’re noticing there are multiple rows for each challenge_id in submission_stats and view_stats table. So we need to merge them first.
select
challenge_id,
sum(total_submissions) as total_submissions,
sum(total_accepted_submissions) as total_accepted_submissions
from
submission_stats
group by
challenge_id;
result-2:
select
challenge_id,
sum(total_views) as total_views,
sum(total_unique_views) as total_unique_views
from
view_stats
group by
challenge_id;
result-3:
Now we need to join result-1, result-2, result-3.
SELECT con.contest_id
,con.hacker_id
,con.name
,ss.total_submissions
,ss.total_accepted_submissions
,vs.total_views
,vs.total_unique_views
FROM contests con
JOIN colleges col ON con.contest_id = col.contest_id
JOIN challenges cha ON col.college_id = cha.college_id
LEFT JOIN (
SELECT challenge_id
,sum(total_views) AS total_views
,sum(total_unique_views) AS total_unique_views
FROM view_stats
GROUP BY challenge_id
) vs ON cha.challenge_id = vs.challenge_id
LEFT JOIN (
SELECT challenge_id
,sum(total_submissions) AS total_submissions
,sum(total_accepted_submissions) AS total_accepted_submissions
FROM submission_stats
GROUP BY challenge_id
) ss ON cha.challenge_id = ss.challenge_id;
There are some null value in the result. So we need to convert them into zero by ISNULL(value, 0). To do this, we modify the query like:
SELECT con.contest_id
,con.hacker_id
,con.name
,ISNULL(ss.total_submissions, 0) total_submissions
,ISNULL(ss.total_accepted_submissions, 0) total_accepted_submissions
,ISNULL(vs.total_views, 0) total_views
,ISNULL(vs.total_unique_views, 0) total_unique_views
FROM contests con
JOIN colleges col ON con.contest_id = col.contest_id
JOIN challenges cha ON col.college_id = cha.college_id
LEFT JOIN (
SELECT challenge_id
,sum(total_views) AS total_views
,sum(total_unique_views) AS total_unique_views
FROM view_stats
GROUP BY challenge_id
) vs ON cha.challenge_id = vs.challenge_id
LEFT JOIN (
SELECT challenge_id
,sum(total_submissions) AS total_submissions
,sum(total_accepted_submissions) AS total_accepted_submissions
FROM submission_stats
GROUP BY challenge_id
) ss ON cha.challenge_id = ss.challenge_id;
So now we have a result where contests table has connection with view_stats and submission_stats table. We can name the result as interviews.
WITH interviews
AS (
SELECT con.contest_id as contest_id
,con.hacker_id as hacker_id
,con.name as name
,ISNULL(ss.total_submissions, 0) total_submissions
,ISNULL(ss.total_accepted_submissions, 0) total_accepted_submissions
,ISNULL(vs.total_views, 0) total_views
,ISNULL(vs.total_unique_views, 0) total_unique_views
FROM contests con
JOIN colleges col ON con.contest_id = col.contest_id
JOIN challenges cha ON col.college_id = cha.college_id
LEFT JOIN (
SELECT challenge_id
,sum(total_views) AS total_views
,sum(total_unique_views) AS total_unique_views
FROM view_stats
GROUP BY challenge_id
) vs ON cha.challenge_id = vs.challenge_id
LEFT JOIN (
SELECT challenge_id
,sum(total_submissions) AS total_submissions
,sum(total_accepted_submissions) AS total_accepted_submissions
FROM submission_stats
GROUP BY challenge_id
) ss ON cha.challenge_id = ss.challenge_id
)
SELECT *
FROM interviews;
Finally, from interviews table, we will find out contest_id, hacker_id, name, sum(total_submissions), sum(total_accepted_submissions), sum(total_view), sum(total_unique_views) for each contest where all four sums are greater than zero order by contest
WITH interviews
AS (
SELECT con.contest_id AS contest_id
,con.hacker_id AS hacker_id
,con.name AS name
,ISNULL(ss.total_submissions, 0) total_submissions
,ISNULL(ss.total_accepted_submissions, 0) total_accepted_submissions
,ISNULL(vs.total_views, 0) total_views
,ISNULL(vs.total_unique_views, 0) total_unique_views
FROM contests con
JOIN colleges col ON con.contest_id = col.contest_id
JOIN challenges cha ON col.college_id = cha.college_id
LEFT JOIN (
SELECT challenge_id
,sum(total_views) AS total_views
,sum(total_unique_views) AS total_unique_views
FROM view_stats
GROUP BY challenge_id
) vs ON cha.challenge_id = vs.challenge_id
LEFT JOIN (
SELECT challenge_id
,sum(total_submissions) AS total_submissions
,sum(total_accepted_submissions) AS total_accepted_submissions
FROM submission_stats
GROUP BY challenge_id
) ss ON cha.challenge_id = ss.challenge_id
)
SELECT contest_id
,hacker_id
,name
,sum(total_submissions)
,sum(total_accepted_submissions)
,sum(total_views)
,sum(total_unique_views)
FROM interviews
GROUP BY contest_id
,hacker_id
,name
HAVING sum(total_submissions) + sum(total_accepted_submissions) +
sum(total_views) + sum(total_unique_views) > 0
ORDER BY contest_id;
Problem link: https://www.hackerrank.com/challenges/interviews/problem
If you find this helpful, share it with your friends. If you have any question, knock me on linkedin
Ifta Khairul Alam
Software Engineer at Goava