FAANG’s SQL Interview Questions

Ace the SQL Round in Data Engineering Interviews

Nnamdi Samuel
Art of Data Engineering
4 min readJan 10, 2024

--

Photo by Microsoft Edge on Unsplash

The SQL round in data engineering interviews is often considered challenging. Mastery in this area can be achieved through consistent and thorough practice. Below, you’ll find some SQL interview problems accompanied by solutions and detailed explanations to help you prepare effectively.

Question 1

The LinkedIn creation team is looking for power creators who use their personal profiles as a company or influencer page. If someone’s LinkedIn has more followers than the company they work for, we can assume that person is a power creator. Write a query to return the IDs of these LinkedIn page creators.

Assumption: A person can work at multiple companies.

Tables creation

-- Table one
CREATE TABLE profile_table (
Profile_id INT,
Name VARCHAR(255),
Followers INT
);

INSERT INTO profile_table (Profile_id, Name, Followers)
VALUES
(1, 'Nick Singh', 92000),
(2, 'Zach Wilson', 199000),
(3, 'Daliana Liu', 171000),
(4, 'Ravit Jain', 107000),
(5, 'Vin Vashishta', 139000),
(6, 'Susan Wojcicki', 39000);

-- Table two
CREATE TABLE personal_profile (
Personal_profile_id INT,
company_id INT
);

INSERT INTO personal_profile (Personal_profile_id, company_id)
VALUES
(1, 4),
(1, 9),
(2, 2),
(3, 1),
(4, 3),
(5, 6),
(6, 5);

-- Table three
CREATE TABLE company_table (
company_id INT,
Name VARCHAR(255),
Followers INT
);

INSERT INTO company_table (company_id, Name, Followers)
VALUES
(1, 'The Data Science Podcast', 8000),
(2, 'Airbnb', 700000),
(3, 'The Ravit Show', 6000),
(4, 'Datalemur', 200),
(5, 'Youtube', 16000000),
(6, 'DataScience Vin', 4500);

Displaying all the tables

Screenshot by author
Screenshot by author
Screenshot by author

Solution

SELECT
p.name AS profile_table_name,
p.followers AS profile_table_followers,
c.max_followers AS max_company_followers
FROM
profile_table p
JOIN (
SELECT
pp.personal_profile_id,
MAX(c.followers) AS max_followers
FROM
personal_profile pp
INNER JOIN
company_table c ON pp.company_id = c.company_id
GROUP BY
pp.personal_profile_id
) c ON p.profile_id = c.personal_profile_id
WHERE
p.profile_table_followers > c.max_company_followers;

Explanations

SELECT
pp.personal_profile_id,
MAX(cp.followers) AS max_followers
FROM
personal_profile pp
INNER JOIN
company_table c ON pp.company_id = c.company_id
GROUP BY
pp.personal_profile_id;

This subquery identifies, for each personal_profile_id, the maximum number of followers among the associated companies.


SELECT
p.name AS profile_table_name,
p.followers AS profile_table_followers,
c.max_followers AS max_company_followers
FROM
personal_profile pp
JOIN (

--Subquery goes here
) c ON p.profile_id = c.personal_profile_id
WHERE
p.followers > c.max_followers;

In the main query, I joined the personal_profiles table (pp) with the results of the subquery using the profile_id. The output includes the name and followers from the personal_profiles table, along with the max_followers from the subquery, but only for rows where the personal profile has more followers than the maximum company followers.

Output

Screenshot by author

Question 2

Find the total number of clocked hours by each employee (inside the office), where 1 means punch in and 0 means punch out for the “Flag” column. Employees can do multiple punch-ins and punch-outs in a day. For each punch-in, there will be a punch-out.

Table creation

CREATE DATABASE faang;
use faang;

CREATE TABLE employee_data (
empd_id INT,
swipe TIME(6),
flag INT
);

INSERT INTO employee_data (empd_id, swipe, flag)
VALUES
(11114, '08:30:00.0000000', 1),
(11114, '10:30:00.0000000', 0),
(11114, '11:30:00.0000000', 1),
(11115, '15:30:00.0000000', 0),
(11115, '09:30:00.0000000', 1),
(11115, '17:30:00.0000000', 0);

Displaying all the columns in the table

Screenshot by author

Solution

SELECT
empd_id,
SUM(clocked_in_hours) AS clocked_in_hours
FROM (
SELECT
empd_id,
DATEDIFF(hour, MIN(swipe), MAX(swipe)) AS clocked_in_hours
FROM (
SELECT
empd_id,
ROW_NUMBER() OVER (PARTITION BY empd_id, flag ORDER BY swipe) AS rn,
swipe
FROM employee_data
) AS ranked
GROUP BY empd_id, rn
) AS aggregated
GROUP BY empd_id;

Explanations

SELECT
empd_id,
ROW_NUMBER() OVER (PARTITION BY empd_id, flag ORDER BY swipe) AS rn,
swipe
FROM employee_data

This is the innermost subquery. It assigns a row number (‘rn’) to each record within each partition defined by ‘empd_id’ and ‘flag’, ordered by the 'swipe' column.

SELECT
empd_id,
DATEDIFF(hour, MIN(swipe), MAX(swipe)) AS clocked_in_hours
FROM (
-- Innermost subquery goes here
) AS ranked
GROUP BY empd_id, rn

The middle subquery takes the result of the innermost subquery (aliased as ‘ranked’) and calculates the difference in hours (‘DATEDIFF’) between the minimum (‘MIN’) and maximum (‘MAX’) 'swipe' values for each ‘empd_id’ and ‘rn’ group. This represents the hours worked during a particular session.

SELECT
empd_id,
SUM(clocked_in_hours) AS clocked_in_hours
FROM (
-- Middle subquery goes here
) AS aggregated
GROUP BY empd_id;

This is the outermost subquery that takes the result of the middle subquery (aliased as ‘aggregated’) and calculates the total sum of ‘clocked_in_hours’ for each ‘empd_id’. This represents the total hours worked by each employee.

Output

Screenshot by author

These problems are sourced from Datalemur and should serve as a guide for preparation for SQL interviews. The first problem was carried out in PostgreSQL, while the latter was carried out in Microsoft SQL Server.

Thank you for reading! If you found this interesting, follow me and subscribe to my latest articles. Catch me on LinkedIn and follow me on Twitter

--

--

Nnamdi Samuel
Art of Data Engineering

Data Engineer💥Voracious Reader and a Writer || Chemical Engineer