Sql : Leet code Problem Solved : Category: Easy : Write an SQL query to find the average selling price for each product: Problem 4
This is in contniuation of my Let us solve leetcode sql problem(s) together series. Below is the previous one. I encourage you to go this below problem too for more fun.
For anyone who directly land into this article, we will try to resolve this problem in below 4 steps.
Section 1: Talks about the Problem
Section 2: Prepare your data
Section 3: Analyse the Problem and find the most optimized solution
Section 4: Write the sql query for the problem
Let us jump into the world of problem without further delay.
Section 1: Talks about the Problem
— Table: Prices
— + — — — — — — — -+ — — — — -+
— | Column Name | Type |
— + — — — — — — — -+ — — — — -+
— | product_id | int |
— | start_date | date |
— | end_date | date |
— | price | int |
— + — — — — — — — -+ — — — — -+
— (product_id, start_date, end_date) is the primary key for this table.
— Each row of this table indicates the price of the product_id in the period from start_date to end_date.
— For each product_id there will be no two overlapping periods. That means there will be no two intersecting periods for the same product_id.
— Table: UnitsSold
— + — — — — — — — -+ — — — — -+
— | Column Name | Type |
— + — — — — — — — -+ — — — — -+
— | product_id | int |
— | purchase_date | date |
— | units | int |
— + — — — — — — — -+ — — — — -+
— There is no primary key for this table, it may contain duplicates.
— Each row of this table indicates the date, units and product_id of each product sold.
— Write an SQL query to find the average selling price for each product.
— average_price should be rounded to 2 decimal places.
— The query result format is in the following example:
— Prices table:
— + — — — — — — + — — — — — — + — — — — — — + — — — — +
— | product_id | start_date | end_date | price |
— + — — — — — — + — — — — — — + — — — — — — + — — — — +
— | 1 | 2019–02–17 | 2019–02–28 | 5 |
— | 1 | 2019–03–01 | 2019–03–22 | 20 |
— | 2 | 2019–02–01 | 2019–02–20 | 15 |
— | 2 | 2019–02–21 | 2019–03–31 | 30 |
— + — — — — — — + — — — — — — + — — — — — — + — — — — +
— UnitsSold table:
— + — — — — — — + — — — — — — — -+ — — — -+
— | product_id | purchase_date | units |
— + — — — — — — + — — — — — — — -+ — — — -+
— | 1 | 2019–02–25 | 100 |
— | 1 | 2019–03–01 | 15 |
— | 2 | 2019–02–10 | 200 |
— | 2 | 2019–03–22 | 30 |
— + — — — — — — + — — — — — — — -+ — — — -+
— Result table:
— + — — — — — — + — — — — — — — -+
— | product_id | average_price |
— + — — — — — — + — — — — — — — -+
— | 1 | 6.96 |
— | 2 | 16.96 |
— + — — — — — — + — — — — — — — -+
— Average selling price = Total Price of Product / Number of products sold.
— Average selling price for product 1 = ((100 * 5) + (15 * 20)) / 115 = 6.96
— Average selling price for product 2 = ((200 * 15) + (30 * 30)) / 230 = 16.96
Section 2: Prepare your data
Now we understood the problem. Let us create the tables and insert the data.
For all my data engineer friends I am sharing the ddl and dml so that you can practice problems without wasting your precious time for writing mundane sql scripts. You can have more fun now to run the sql query.
Create Table Script
create table prices
(
product_id int,
start_date date,
end_date date,
price int
)
insert table script
insert into prices
values
(1,’2019–02–17',’2019–02–28',5),
(1,’2019–03–01',’2019–03–22',20),
(2,’2019–02–01',’2019–02–20',15),
(2,’2019–02–21',’2019–03–31',30)
create table script
create table UnitsSold
(
product_id int,
purchase_date date,
units int
)
insert table script
insert into UnitsSold
values
(1,’2019–02–25',100),
(1,’2019–03–01',15),
(2,’2019–02–10',200),
(2,’2019–03–22',30)
Section 3: Analyse the Problem and find the most optimized solution
Now we are ready with our data, and we understood the problem. Now let us begin the steps which we love to do as data engineer!!
Analysis of data and break the problem and resolve it with our special ability to fetch the data wherever it is however it is we Data Engineers every time make it available without fail. Sorry for being bit dramatic sometime, but we need a bit of drama everywhere to make it big!! We really cannot deny it!!
Let's look at the data together -
Now let us the problem in steps and resolve it-
Ask is we need to get the average selling price for each product?
- First, we need to join 2 tables using a common key which is nothing but product_id.
- Now for each product_id we need to pick up the product sold date and find the correct price of the product at that time. This is required as the price of the product is a changing variable over time.
- Then we need to put a sum on the price * product data as there are different number of items have been sold at different date.
- At final step we divide this summation value by number of items.
Hope I am able to explain the logic I will be using behind the query to solve the problem.
Section 4: Write the sql query for the problem
I am sure many of us have already written the query for it. But just for the interest of many new members in data engineering community I am sharing the sql query also as below -
select p.product_id,
round(cast(sum(p.price*us.units) as decimal)/sum(us.units),2)
from prices as p, UnitsSold as us
where us.purchase_date between p.start_date and p.end_date
and p.product_id = us.product_id
group by p.product_id
As I have used postgresql so I need to cast the data at decimal point. But many of you if you are using other databases if not required, please get rid oof that part of the query.
Please refer to below github link for source code-
sanjitkhasnobis/sql_leetcode_solved (github.com)
Happy Coding!! Happy Reading!!