SQL Interview Questions PART -2

Gunjan Sahu
Nerd For Tech
Published in
7 min readJul 10, 2023

Microsoft

The Number of Seniors and Juniors to Join the Company II

A company wants to hire new employees. The budget of the company for the salaries is $70000. The company’s criteria for hiring are:

  1. Keep hiring the senior with the smallest salary until you cannot hire any more seniors.
  2. Use the remaining budget to hire the junior with the smallest salary.
  3. Keep hiring the junior with the smallest salary until you cannot hire any more juniors.

Write an SQL query to find the ids of seniors and juniors hired under the mentioned criteria.

Return the result table in any order.

create table candidates (
emp_id int,
experience varchar(20),
salary int
);

insert into candidates values
(1,'Junior',10000),(2,'Junior',15000),(3,'Junior',40000),(4,'Senior',16000),(5,'Senior',20000),(6,'Senior',50000);
SELECT * FROM candidates;

Solution:

-- Solution #1
WITH CTE AS (
SELECT employee_id,
experience,
SUM(salary) OVER(PARTITION BY experience ORDER BY salary ASC) AS RN
FROM Candidates
)

SELECT employee_id FROM CTE WHERE experience = 'Senior' AND RN < 70000
UNION
SELECT employee_id FROM CTE WHERE experience = 'Junior' AND RN < (SELECT 70000 - IFNULL(MAX(RN),0) FROM CTE WHERE experience = 'Senior' AND RN < 70000)

breaking down the solution:

SELECT employee_id, experience, SUM(salary) OVER(PARTITION BY experience ORDER BY salary,employee_id ASC) AS RN FROM Candidates
-- Solution #2
-- by Ankit Bansal
with total_sal as (
select *, sum(salary) over(partition by experience order by salary asc rows between unbounded preceding and current row) as running_sal
from candidates),
seniors as(select * from total_sal
where experience = 'Senior' and running_sal <= 70000)
select * from total_sal
where experience = 'Junior' and running_sal <= 70000 -(select sum(salary) from seniors)
UNION ALL
select * from seniors;

Output:

Walmart

First and Last Call On the Same Day

Write an SQL query to report the IDs of the user who had the first and the last call with the same person on any day.

Return the result table in any order.

The query result format is in the following example:

Table : Calls

+--------------+----------+
| Column Name | Type |
+--------------+----------+
| caller_id | int |
| recipient_id | int |
| call_time | datetime |
+--------------+----------+
(caller_id, recipient_id, call_time) is the primary key for this table.
Each row contains information about a call the time of a call between caller_id and recipient_id.
Calls table:
+-----------+--------------+---------------------+
| caller_id | recipient_id | call_time |
+-----------+--------------+---------------------+
| 8 | 4 | 2021-08-24 17:46:07 |
| 4 | 8 | 2021-08-24 19:57:13 |
| 5 | 1 | 2021-08-11 05:28:44 |
| 8 | 3 | 2021-08-17 04:04:15 |
| 11 | 3 | 2021-08-17 13:07:00 |
| 8 | 11 | 2021-08-17 22:22:22 |
+-----------+--------------+---------------------+

Result table:
+---------+
| user_id |
+---------+
| 1 |
| 4 |
| 5 |
| 8 |
+---------+

On 2021-08-24, the first and last call of this day for user 8 was with user 4. User 8 should be included in the answer.
Similary, User 4 had the first and last call on 2021-08-24 with user 8. User 4 should be included in the answer.
On 2021-08-11, user 1 and 5 had a call. The call was the only call for both of them on this day. Since this call is the first and last call of the day for both of them, they both should be included in the answr.

Solution:
Combine caller_id and recipient_id and generate t1. Then generate the rankings as t2. Select the users with rankings equal to 1 and with the same other user.

-- Hard

with t1 as (
select caller_id as id1, recipient_id as id2, call_time from Calls
union all
select recipient_id as id1, caller_id as id2, call_time from Calls
),
t2 as (
select id1, id2, date(call_time) as dt,
rank() over(partition by id1, date(call_time) order by call_time) as rk1,
rank() over(partition by id1, date(call_time) order by call_time desc) as rk2
from t1
)
select distinct id1 user_id from t2
where rk1 = 1 or rk2 = 1
group by dt, id1 having count(distinct id2) = 1;
-- Solution #2   

with calls as(
select Callerid, cast(datecalled as date) as called_date, min(datecalled) as first_call, max(datecalled) as last_call
from phonelog
group by callerid, cast(datecalled as date))

select c.*, p1.recipientid from calls c
inner join phonelog p1 on c.callerid = p1.callerid and c.first_call = p1.datecalled
inner join phonelog p2 on c.callerid = p2.callerid and c.last_call = p2.datecalled
where p1.recipientid=p2.recipientid

Amazon

Merge Overlapping Events in the Same Hall

Description

Table: HallEvents

+-------------+------+
| Column Name | Type |
+-------------+------+
| hall_id | int |
| start_day | date |
| end_day | date |
+-------------+------+
There is no primary key in this table. It may contain duplicates.
Each row of this table indicates the start day and end day of an event and the hall in which the event is held.

Write an SQL query to merge all the overlapping events that are held in the same hall. Two events overlap if they have at least one day in common.

Return the result table in any order.

The query result format is in the following example.

Example 1:

Input: 
HallEvents table:
+---------+------------+------------+
| hall_id | start_day | end_day |
+---------+------------+------------+
| 1 | 2023-01-13 | 2023-01-14 |
| 1 | 2023-01-14 | 2023-01-17 |
| 1 | 2023-01-18 | 2023-01-25 |
| 2 | 2022-12-09 | 2022-12-23 |
| 2 | 2022-12-13 | 2022-12-17 |
| 3 | 2022-12-01 | 2023-01-30 |
+---------+------------+------------+
Output:
+---------+------------+------------+
| hall_id | start_day | end_day |
+---------+------------+------------+
| 1 | 2023-01-13 | 2023-01-17 |
| 1 | 2023-01-18 | 2023-01-25 |
| 2 | 2022-12-09 | 2022-12-23 |
| 3 | 2022-12-01 | 2023-01-30 |
+---------+------------+------------+
Explanation: There are three halls.
Hall 1:
- The two events ["2023-01-13", "2023-01-14"] and ["2023-01-14", "2023-01-17"] overlap. We merge them in one event ["2023-01-13", "2023-01-17"].
- The event ["2023-01-18", "2023-01-25"] does not overlap with any other event, so we leave it as it is.
Hall 2:
- The two events ["2022-12-09", "2022-12-23"] and ["2022-12-13", "2022-12-17"] overlap. We merge them in one event ["2022-12-09", "2022-12-23"].
Hall 3:
- The hall has only one event, so we return it. Note that we only consider the events of each hall separately.

Solution:

with cte as(
select hall_id,start_date,end_date,lag(end_date) over(partition by hall_id order by start_date) as prev_end_date
from hall_events
)
select hall_id,min(start_date) as start_date,max(end_date) as end_date
from cte
where prev_end_date is null or start_date < prev_end_date
group by hall_id
union
select hall_id,start_date,end_date
from cte where start_date>prev_end_date
order by hall_id,start_date;

Breaking down the solution for better understanding

select hall_id,start_date,end_date,lag(end_date) over(partition by hall_id order by start_date) as prev_end_date
from hall_events
with cte as(
select hall_id,start_date,end_date,lag(end_date) over(partition by hall_id order by start_date) as prev_end_date
from hall_events
)
select hall_id,min(start_date) as start_date,max(end_date) as end_date
from cte
where prev_end_date is null or start_date < prev_end_date
group by hall_id

using UNION and combining them all together since we hall id 1 two times which don’t overlap. Final result -

Uber

Description

Write a query to print total rides and profit rides for each driver’s profit ride is when the end location of current ride is same as start location on next ride.

Table 'drivers' 

Rows affected: 7

id | start_time | end_time | start_loc | end_loc
-----------------------------------------------------
dri_1 | 09:00:00 | 09:30:00 | a | b
dri_1 | 09:30:00 | 10:30:00 | b | c
dri_1 | 11:00:00 | 11:30:00 | d | e
dri_1 | 12:00:00 | 12:30:00 | f | g
dri_1 | 13:30:00 | 14:30:00 | c | h
dri_2 | 12:15:00 | 12:30:00 | f | g
dri_2 | 13:30:00 | 14:30:00 | c | h

Solution:

-- Using Lead Window function

SELECT
d1.id AS driver_id,
COUNT(*) AS total_rides,
SUM(IF(d1.end_loc = d2.start_loc, 1, 0)) AS profit_rides
FROM drivers d1
LEFT JOIN drivers d2
ON d1.id = d2.id AND d1.end_time = d2.start_time
GROUP BY d1.id;

-- Using self Join

SELECT
d1.id AS driver_id,
COUNT(*) AS total_rides,
SUM(IF(d1.end_loc = d2.start_loc, 1, 0)) AS profit_rides
FROM drivers d1
LEFT JOIN drivers d2
ON d1.id = d2.id AND d1.end_time = d2.start_time
GROUP BY d1.id;

Output:

For input commands visit this folder- https://drive.google.com/drive/folders/1nD-VPcxus2wMljMogBx4hHt4yUIwkKzI?usp=sharing

Hope you find it helpful! Please follow, like and share.

Happy Learning! ❤️

--

--