Project Data Analysis for Retail : Sales Performance Report

M. Hamzah
Analytics Vidhya
Published in
7 min readNov 22, 2021

--

Measuring Retail Store Sales Performance with SQL

Source : freepic

Background Project

This undertaking supplied by means of DQLab, right here used to be given the dataset includes the transactions document from 2009 till 2012 consist 5500 rows, which is the order reputation area has the price ‘Order Finished’, ‘Order Returned’ and ‘Order Cancelled’.

What the task given?

Through the information has given, the supervisor of DQLab Store desires to understand:

  1. Overall performance DQLab Strore from 2009 till 2012 for Order numbers and total sales which order status is finished
  2. Overall performance of DQLab through product subcategory to be in contrast between 2011 and 2012
  3. The effectiveness and effectivity of promotions carried out so far, by means of calculating the burn rate of the overall promotions by way of year
  4. The effectiveness and effectivity of promotions carried out so far, via calculating the burn rate of the overall promotions by way of sub-category of product on 2012
  5. Analysis of clients each year
  6. Analysis range of new clients for every year

Dataset Brief

All facts for this mission are saved in the desk known as dqlab_store_sales. You can see the look of the dataset under :

#table information
DESC dqlab_sales_store;

#first 5 records
SELECT *
FROM dqlab_sales_store
LIMIT 5;

Output :

Fig 1 : dqlab_sales_store Table information
Fig 2 : First 5 rows appearance of dqlab_sales_store table

Let’s solving this project !

  1. Overall performance by years
SELECT YEAR(order_date) years,
SUM(sales) sales,
COUNT(order_status) 'number of orders'
FROM dqlab_sales_store
WHERE order_status = 'Order Finished'
GROUP BY 1;

order_date area has a date format, we can get the 12 months phase of date structure via the usage of YEAR() function. Next, we use the SUM() feature to get whole sales, then COUNT() feature to get the quantity of order. Don’t forget about to add WHERE clause to filter order_status simply for finished order. Finally, to get the cost via years, we need to put GROUP BY 1 on the query because 12 months subject is positioned in the first column.

Output :

Fig 3: Number of sales and order

We can see, complete income of DQLab keep are modified over the year. The best whole income had been in 2009 and it doesn’t get greater after that. But distinctive from the variety of order, it goes trip barring in 2011. Although the trade isn’t too widespread over the years.

2. Overall Performance by Sub Category

SELECT YEAR(order_date) as years,
product_sub_category,
sum(sales) as sales
FROM dqlab_sales_store
WHERE YEAR(order_date) BETWEEN 2011 AND 2012 AND order_status = 'Order Finished'
GROUP BY years, product_sub_category
ORDER BY years, sales DESC;

Output :

Fig 4 : Sales By Sub Category on 2011 & 2012
SELECT *,
ROUND((sales2012-sales2011)*100/sales2012, 1) 'sales growth (%)'
FROM(
SELECT product_sub_category,
SUM( IF( YEAR(order_date) = 2011, sales, 0)) sales2011,
SUM( IF( YEAR(order_date) = 2012, sales, 0)) sales2012
FROM dqlab_sales_store
WHERE order_status = 'Order Completed'
GROUPS BY product_sub_category
) sub_category
MESSAGE AFTER 4 DESC;

Here the pivot desk used to be used to evaluate the complete income in 2011 with 2012. We can use SUM() feature observed through IF() feature to do it. SUM() is used to get the complete income and IF() used to filter by means of 12 months that we favor to specify.

Output :

Fig 5 : Compare total Sales by Sub Category on 2011 & 2012 with growth sales

Most the increase income are lead the increases, proven by means of a nice value. But there are some sub-category merchandise that bought a decline in income from 2011 to 2012 which proven with the aid of a bad value. Labels, Copiers & Fax and tables are the classes that received a decline in income the most.

3. Promotion Effectiveness and Efficiency by Years

SELECT YEAR(order_date) years,
SUM(sales) sales,
SUM(discount_value) 'promotion_value',
ROUND( SUM(discount_value)*100/SUM(sales), 2) 'burn_rate_percentage'
FROM dqlab_sales_store
WHERE order_status = 'Order Finished'
GROUP BY 1;

In this project, burn rate evaluation is used to understanding the effectiveness and effectivity of promotions. It does by using evaluating the sum of promoting cost with complete sales. Here DQLab hopes that the burn price doesn’t exceed 4.5%. Burn price calculating is proven beneath :

Fig 6 : Burn Rate Formula

Output :

Fig 7 : Burn Rate by Year

The consequences inform us that burn prices are above 4.5% for every 12 months as overall. This shows that the promotions have been carried out haven’t been in a position to minimize the burn charge to a most of 4.5%. We can parent out what is the product which made a big contribution inflicting the burn price to be greater than anticipated by means of grouping the question by means of every product.

4. Promotion Effectiveness and Efficiency by using Product Sub Category

SELECT YEAR(order_date) years,
product_sub_category,
product_category,
SUM(sales) sales,
SUM(discount_value) promotion_value,
ROUND(SUM(discount_value)*100/SUM(sales),2) 'burn_rate_percentage'
FROM dqlab_sales_store
WHERE YEAR(order_date) = 2012
AND order_status = 'Order Finished'
GROUP BY 3, 2, 1
ORDER BY 4 DESC;

Output :

Fig 8 : Burn Rate by sub-category of product

There are solely 5 sub-category of product that have the burn charge bellow 4.5 %. It proven on the first 5 rows, they are beginning on Rubber Bands to Telephones and Communication. Whereas the Labels have greater 0.02% from the most price of anticipated burn charge by way of DQLab Store.

It is very fascinating due to the fact from these outcomes we be aware of that nevertheless many sub-category of product which have the burn charge greater than 4.5%.

5. Customers Transactions per Year

SELECT YEAR(order_date) years,
COUNT(DISTINCT customer) 'number of customers'
FROM dqlab_sales_store
WHERE order_status = 'Order Finished'
GROUP BY 1;

The calculation of clients variety for every year doesn’t contain the reproduction value. That is why DISTINCT is used in this question for getting the special price of clients number.

Output :

Fig 9 : Number of customers by year

The variety of clients isn’t altering considerably overall. But fortunately, we didn’t get a giant decline in customers. We can see that the quantity of clients tends to be in the values round 580–590.

6. New Customer Over the Years

SELECT YEAR(first_order) years,
COUNT(customer) 'new customer'
FROM(
SELECT customers,
MIN(order_date) first_order
FROM dqlab_sales_store
WHERE order_status = 'Order Finished'
GROUP BY 1) first
GROUP BY 1;

To get the wide variety of new clients for every year, we solely want the information that suggests the first time transaction from every customer. We can get it by means of making use of MIN() feature on the first_order subject then calculate the variety of customers.

Output :

Fig 10 : Number of new customers by year

The boom of new clients for every 12 months is decreasing. It receives excessive in 2012 that solely there 11 new customers. But if we returned on the end result earlier than (fig.9) the variety of clients tends to continue to be overall. This informs us that many preceding clients nonetheless returned to DQLab Store to do the transaction without the new clients are decreasing.

SUMMARY

According to records evaluation that we have done, we can conclude that :

  1. The whole income and order variety of DQLab Store are fluctuated as overall, the place the easiest complete income passed off in 2009. Meanwhile, the quantity of orders accelerated barring in 2011 although the extend is now not too significant.
  2. Total income primarily based on the sub-category of merchandise received growing between 2011 and 2012 as overall. But there some of them is obtained income decreasing, they are Appliances, Bookcases, Tables, Labels, Copiers & Fax.
  3. The burn rate over the year nonetheless above 4.5%. This shows that the effectiveness and effectivity of promotions haven’t been accomplishing as DQLab Store hopes.
  4. There are many merchandise that have a burn price above 4.5%. This reasons the universal burn charge is above 4.5% for every year. The solely merchandise that have a burn charge bellow 4.5% are Rubber Bands, Envelopes, Chairs & Chairmats, Tables and Telephones & Communication.
  5. The range of clients over the years tends to remain, it is around 580–590.
  6. The quantity of new clients lowering over the years, with the lowest variety of new clients is 11 in 2012.

--

--

M. Hamzah
Analytics Vidhya

Bachelor of Computer Science / Informatics | Data Enthusiast | Have an interest in Data Science and Data Analysis