8 Weeks SQL Challenge Case Study #3 — Foodie Fi

Orkun Aran
4 min readFeb 28, 2023

Hi everyone. I hope everyone is safe and healthy.

After a long break, I returned to writing again (my son is 18 months old now and now I can find relatively spare time to create content; also got my associate degree in the academy — so I can ease my schedule).

Let’s dive into Challenge 3 of the program which involves analyzing data from a fictional online marketplace to help its CEO make informed decisions. For previous solutions please see;

Challenge I

Challenge II — Part I and Part II

So let’s go for Challenge III — please be informed I won’t copy Danny’s introduction and challenge description. They can be found at this link.

Problems

Based off the 8 sample customers provided in the sample from the subscriptions table, write a brief description about each customer’s onboarding journey.

-- Brief insights on customers
-- Selected customers (1, 2, 11, 13, 15, 16, 18, 19)
WITH plannings AS
(SELECT s.customer_id, s.plan_id, s.start_date,
DENSE_RANK() OVER (PARTITION BY s.customer_id
ORDER BY s.plan_id) as plan_changes
FROM subscriptions s
JOIN plans p ON p.plan_id = s.plan_id)
SELECT * FROM plannings WHERE customer_id IN (1, 2, 11, 13, 15, 16, 18, 19)
ORDER BY customer_id, start_date;

1. All customers started with a 7-day free trial and upgraded their subscription plan.
2. 6 customers’ subscriptions are ongoing, but customers 11 and 15 ended their subscriptions.
3. Customer 11 didn’t subscribe to a paid membership
4. Vast majority of the subscribed customers upgraded their plan after a while

To avoid “joins” each time I created a new table;

DROP TABLE IF EXISTS foodie_fi_tables;
CREATE TABLE foodie_fi_tables
SELECT s.customer_id, s.plan_id, p.plan_name, p.price, s.start_date
FROM subscriptions s
JOIN plans p ON p.plan_id = s.plan_id;

How many customers has Foodie-Fi ever had?

SELECT DISTINCT COUNT(customer_id) as customers
FROM foodie_fi_tables;

What is the monthly distribution of trial plan start_date values for our dataset use the start of the month as the group by value

SELECT MONTH(start_date)
,COUNT(DISTINCT customer_id) as "trial_distribution"
FROM foodie_fi_tables
WHERE plan_id=0
GROUP BY MONTH(start_date)
ORDER BY MONTH(start_date);

What plan start_date values occur after the year 2020 for our dataset? Show the breakdown by count of events for each plan_name

SELECT plan_id
, COUNT(*)
FROM foodie_fi_tables
WHERE YEAR(start_date) >2020
GROUP BY plan_id
ORDER BY plan_id;

What is the customer count and percentage of customers who have churned rounded to 1 decimal place?

SELECT COUNT(DISTINCT customer_id) as churned_customers,
ROUND(COUNT(DISTINCT customer_id) * 100 /
(SELECT COUNT(DISTINCT customer_id)
FROM foodie_fi_tables),1) as churn_percentage
FROM foodie_fi_tables
WHERE plan_id = 4;

How many customers have churned straight after their initial free trial. What percentage is this rounded to the nearest whole number?

WITH plannings AS 
(
SELECT customer_id,
plan_id,
start_date,
DENSE_RANK() OVER (PARTITION BY customer_id
ORDER BY plan_id) as plan_changes
FROM foodie_fi_tables
)
SELECT COUNT(DISTINCT customer_id) as churn_customers,
ROUND(COUNT(DISTINCT customer_id) * 100 /
(SELECT COUNT(DISTINCT customer_id)
FROM foodie_fi_tables)) AS churn_percentage
FROM plannings
WHERE plan_changes = 2 AND plan_id = 4;

What is the number and percentage of customer plans after their initial free trial?

WITH plannings AS 
(
SELECT customer_id,
plan_id,
start_date,
DENSE_RANK() OVER (PARTITION BY customer_id
ORDER BY plan_id) as plan_changes
FROM foodie_fi_tables
)
SELECT plan_id,
COUNT(plan_id) as number_of_customers,
ROUND(COUNT(plan_id) * 100 /
(SELECT COUNT(*)
FROM foodie_fi_tables)) AS plan_percentages
FROM plannings
WHERE plan_id NOT IN (0,4) AND plan_changes = 2
GROUP BY plan_id
ORDER BY plan_id;

What is the customer count and percentage breakdown of all 5 plan_name values at 2020–12–31?

WITH plannings AS 
(
SELECT customer_id,
plan_id,
start_date,
DENSE_RANK() OVER (PARTITION BY customer_id
ORDER BY plan_id) as plan_changes,
plan_name
FROM foodie_fi_tables
)
SELECT COUNT(DISTINCT plan_id) / COUNT(*) * 100 AS percentages,
plan_id
FROM plannings
WHERE plan_id = ANY(SELECT MAX(plan_id) FROM plannings GROUP BY customer_id)
AND start_date < "2020-12-31"
GROUP BY plan_id;

How many customers have upgraded to an annual plan in 2020?

SELECT COUNT(*) AS pro_annual_count 
FROM foodie_fi_tables
WHERE YEAR(start_date) = 2020 AND plan_name = "pro annual";

How many days on average does it take for a customer to an annual plan from the day they join Foodie-Fi?

SELECT 
AVG(DATEDIFF(start_date, prev_start_date)) AS avg_plan_update
FROM
(SELECT
customer_id,
start_date,
plan_name,
(SELECT MAX(start_date)
FROM foodie_fi_tables s2
WHERE s2.customer_id = s1.customer_id
AND s1.start_date > s2.start_date) AS prev_start_date
FROM
foodie_fi_tables s1
) AS sub_query
WHERE plan_name IN ("trial", "pro annual")
AND prev_start_date IS NOT NULL
AND YEAR(start_date) = "2020";

Can you further breakdown this average value into 30 day periods (i.e. 0–30 days, 31–60 days etc)

WITH day_breakdowns AS
(SELECT
(CASE WHEN DATEDIFF(start_date, prev_start_date) < 30
THEN DATEDIFF(start_date, prev_start_date)
END) AS `0-30 days`,
(CASE WHEN DATEDIFF(start_date, prev_start_date)
BETWEEN 31 AND 60 THEN DATEDIFF(start_date, prev_start_date)
END) AS `31-60 days`,
(CASE WHEN DATEDIFF(start_date, prev_start_date) > 60
THEN DATEDIFF(start_date, prev_start_date)
END) AS `>60 days`
FROM
(SELECT customer_id,
start_date,
plan_name,
(SELECT MAX(start_date)
FROM foodie_fi_tables s2
WHERE s2.customer_id = s1.customer_id
AND s1.start_date > s2.start_date) AS prev_start_date
FROM foodie_fi_tables s1
) AS sub_query
WHERE plan_name IN ("trial", "pro annual")
AND prev_start_date IS NOT NULL
AND YEAR(start_date) = "2020")
SELECT AVG(`0-30 days`) AS `0-30 days avg`,
AVG(`31-60 days`) AS `31-60 days avg`,
AVG(`>60 days`) AS `>60 days avg`
FROM day_breakdowns;

How many customers downgraded from a pro monthly to a basic monthly plan in 2020?

SELECT f1.customer_id, 
f1.plan_name,
f1.start_date,
f2.plan_name,
f2.start_date
FROM foodie_fi_tables f1
JOIN foodie_fi_tables f2 ON f1.customer_id = f2.customer_id AND f1.start_date > f2.start_date
WHERE (f1.plan_name = "basic monthly" AND f2.plan_name = "pro monthly") AND YEAR(f2.start_date) = 2020;

That’s parts A and B. I hope that helps.

PS: I haven’t managed to solve creating “payments” table. Still trying to solve that. Solutions for Part D is on the way.

--

--

Orkun Aran

Data Scientist & Analyst | Freelancer | Assistant Professor, Healthcare |