SQL Data Challenge on DVD rental dataset
Case1:
Your are working for a company that rents out DVD movies. During an event they want to put up posters featuring their customers’ most popular actor/actress. You have been asked to find the name of the actor/actress that features most often in movies the customers rent, along with the number of views. We assume that one rental equals one view.
Requirements
The query should output the following columns:
- first_name
- last_name
- number_of_views
and has the following requirements:
- only returns the 1 top actor, ordered by total number of views
My Query
With ActorViews as (
SELECT actor.first_name, actor.last_name, count(rental.rental_id)
,row_number() over(order by COUNT(actor.actor_id) DESC )
FROM customer
JOIN rental ON customer.customer_id = rental.customer_id
JOIN inventory ON rental.inventory_id = inventory.inventory_id
JOIN film_actor ON inventory.film_id = film_actor.film_id
JOIN actor ON actor.actor_id = film_actor.actor_id
group by actor.actor_id,actor.first_name,actor.last_name
)
Select ActorViews.first_name as first_name,
ActorViews.last_name as last_name,
ActorViews.count as number_of_views from ActorViews
where row_number = 1
ORDER BY ActorViews.count DESC;
Case2:
Most customers return their rentals in time, but some take longer to return the movies. The store is considering to ban customers that fail to return a film within the allowed 8 full days. This means that customers that returns a film after 8.5 days are fine and can continue renting more movies, but a customer that still hasn’t returned a movie when 9 whole days have passed from the time of rental will not be allowed to rent a movie again.
Before the chain makes this decision, they want to assess what effect it might have on the business by looking at their rental history. How many rentals would they have missed out on if that rule had been implemented from the start?
A co-worker made a draft for a query, but didn’t have time to finish it. If you want, you can continue on their code, or you can start over. It’s up to you.
Requirements
The query should output the following columns:
- number_of_lost_rentals
and has the following requirements:
- once a customer has had a movie for 9 or more days without returning it, that customer should not be allowed to rent any more movies
- count the number of rentals that wouldn’t have happened if we had followed the rule abov
My query
with rental_history as (
select
customer_id,
rental_id
,rental_date
,return_date
,rental_date + interval ‘9 day’ as ban_date
,coalesce(return_date, now())-rental_date as days_out
from rental
),
GetRentalBanDate as
(
select *,
case when return_date >= ban_date then ban_date end as Rentalbandate
from rental_history
order by customer_id,rental_id
),
CustomerFirstBanDate as
(
select GetRentalBanDate.customer_id,min(GetRentalBanDate.Rentalbandate) as FirstBanDate
from GetRentalBanDate
group by GetRentalBanDate.customer_id
)
select count(rental_id) as number_of_lost_rentals from GetRentalBanDate join CustomerFirstBanDate
using(customer_id) where GetRentalBanDate.rental_date>=(CustomerFirstBanDate.FirstBanDate)
Case3:
The DVD rental chain has several stores, and each store has multiple copies of most movie titles. The chain is looking at buying more copies of the most popular titles. You are asked to find the titles that have been completely “sold out” at least 5 times in the same store. How many of them are there?
A co-worker made a draft for a query, but didn’t have time to finish it. If you want, you can continue on their code, or you can start over. It’s up to you.
Requirements
The query should output the following columns:
- number_of_titles
and has the following requirements:
- count the titles that have been completely out in the same store on 5 separate occasions or more
/* Get the rental tracker for all films by store based on rental and returndates*/
with Rental_Tracker as (
select
store_id,
film_id,
rental_date as event_date,
‘renting’ as event_type,
inventory_id,
1 as number_of_rented_films
from rental
inner join inventory using (inventory_id)
union all
select
store_id,
film_id,
return_date as event_date,
‘returning’ as event_type,
inventory_id,
-1 as number_of_rented_films
from rental
inner join inventory using(inventory_id)
),
/* Get total inventory based on store for each film */
Inventory_Cnt as (
select store_id, Film_Id, count(inventory_id) as invcnt
from inventory RE
group by store_id, Film_Id
),
/* Get cumulative sum of number_of_rented_films based on event_date for each film by store */
Cumulative_Sum as (
select *,
SUM(Rental_Tracker.number_of_rented_films)
OVER (partition by store_id,Film_Id order by Rental_Tracker.event_date rows unbounded preceding) RunningTotal
from Rental_Tracker
),
/* Get all Sold_Out instances for each film by Store, if the cumulative sum is equal to inventory count then it is a sold_out instance */
Check_SoldOut as (
select Cumulative_Sum.*
from Cumulative_Sum join Inventory_Cnt using (store_id, film_id)
where Cumulative_Sum.RunningTotal = Inventory_Cnt.invcnt
),
/*Get Films by store where there are atleast 5 number of sold_out instances */
Five_SoldOuts as (
select Check_SoldOut.film_id
from Check_SoldOut
group by store_id, film_id
having count(RunningTotal) >= 5
)
/*Get count of Film_ids which have atleast five sol_out instances*/
select count(film_id) as number_of_titles from Five_SoldOuts