SQL, Data Analysis

SuperStore Sales Analysis with SQL

Identifying business growth opportunities by analyzing Superstore sales performance data

Nanda Harahap
6 min readMar 18, 2022
Photo by Jennifer Uppendahl on Unsplash

SQL is one of the programming languages which allows us to access and perform data analysis by using all the available data within our database, and manipulating or filtering it through queries. Thus, we can generate more insightful information out of the raw data.

In this article, we will use the superstore dataset, to analyze sales performance throughout the years and business growth opportunities both from the company and customer perspective. The dataset contains order, Consumer, and Item Categories information.

Task List

  • How are the sales quantity and profit performance throughout the years?
  • Which region has the highest sales?
  • Which city has the highest profit?
  • Which segment and item have generated the most profit?
  • how long is from order to shipping lead time for each shipping option?

Data Preparation & Exploration

Before we jump into data processing or write queries, we have to understand our data structure, such as column name, data type, null value, and others.

Since our data already seems okay, we can proceed to perform data analysis by using all the available data tables.

Data Analysis

  1. How are the sales and profit performance throughout the years?

Sales Quantity and profit columns were used to explore the superstore financial performance. We can analyze the sales quantity and profit data for each year by aggregating (sum) of the total sales and profit.

SELECT 
date_trunc(date(order_date),year) as Order_Date,
sum(quantity) as Total_Quantity,
round(sum(Profit),2) as Total_Profit,
FROM `superstore_order`
group by 1
order by 1 desc

throughout the years (2014–2017), Store sales showed significant growth for Sales Quantity and Total Profit. However, if we take a closer look, sales and profit were so volatile within the Q1 for each year.

Sales Data is explored through GDS for Data Visualization purposes (Month-Year Trend)

2. Which region has the highest sales?

Region data is used for each area’s productivity. Across the regions, West Region has the highest total sales and profit. Indicating numerous possibilities of strategy or adjustment that can be taken and applied to the other regions.

SELECT 
region,
date_trunc(date(order_date),year) as year,
round(sum(sales),2) as Total_Sales,
round(sum(Profit),2) as Total_Profit,
FROM `superstore_order`
group by 1,2
order by 1

Thus, for the business improvement purpose, South Region can be the main focus for area refinement, since the performance is quite concerning due to the slope of sales performance in 2015 and 2017.

West region accumulated the highest sales compared to other regions

3. Which city has the highest profit?

In more detail, we can analyze which city has contributed the most to company revenue. This will give us insight about profitable and not profitable city.

SELECT 
country,
city,
sum(quantity) as Total_Sold,
round(sum(sales),2) as Total_Sales,
round(sum(Profit),2) as Total_Profit,
FROM `superstore_order`
group by 1,2
order by 5 desc
sample of the city’s profit data table

New York, Los Angeles, and Seattle are the top 3 profitable cities. These cities are generating most of the total items sold across the United States, which also significantly contributes to the company profitability.

4. Which segment and item have generated the most profit?

Item and segment can be analyzed to maximize store sales performance. By a total of segment sales, we may boost item sold by adjusting the shelf, stock, and demand variability for each item and city in a more advanced approach.

with Item as 
(select
Ca.Product_ID as Product_ID,
Ca.Category as Category,
Ca.Sub_Category as Sub_Category,
Product_Name,
sum(O.quantity) as Total_Sold,
round(sum(O.Profit),2) as Total_Profit
FROM `superstore_categories` as Ca
left join `superstore_order` as O
on Ca.Product_ID = O.Product_ID
group by 1,2,3,4
order by 6 desc
)
,
Item_Pros as (
select
*,
case
when Total_Profit > 4000 then ‘Profit Maker’
when Total_Profit between 0 and 4000 then ‘Volume Maker’
else ‘Not Profitable’
end as Item_Cat
from Item)
select
*
from Item_Pros
sample of the sales item category

The technology segment is the leading profit-maker among the available segments. Followed by office supplies and furniture. It implied for the profitable region/city might optimize the technology segment to boost the company profitability, while maximizing the allocation of office supplies and furniture stock and sales.

5. how long is from order to shipping lead time for each shipping option?

*assuming lead time is the range of time from order booked until the shipping is made.

From the customer’s perspective, lead time may affect their satisfaction level from the order made until the item shipped and they received the item. If we can streamline the process from order to shipping for the most service options, might increase their satisfaction.

with lead_time as 
(
select
Ship_Mode,
Order_Date,
Ship_Date,
date_diff(ship_date,order_date,Day) as lead_time_delivery
from `superstore_order`
group by 1,2,3
)
,
shipping as
(
select
Ship_Mode,
count(ship_mode) as total_shipping_service,
-- round(avg(lead_time_delivery),2) as avg_lead_time,
lead_time_delivery
from lead_time
group by 1,3
order by 1 desc
)
select
*
from shipping
sample of the ship_mode lead time data table

Standard class is the most shipping option chosen by the customer. The average lead time for the standard class is 5 days. However, the deviation for the lead time is around 3 to 7 days. There is a possibility that the store may improve the process during shipping made, deliver the customer’s item faster.

Conclusion

Through analyzing and utilizing our data, we can improve multiple aspects to improve our business, in this case, the superstore may ameliorate the sales performance by targeting the weak spot/area, opportunistic segment, and streamlining any bottlenecked process. In purpose to improve internal processes and satisfied the customers.

Recommendation

  1. We may focus on amplifying the West Region to boost company profitability, while also improving the South Region performance
  2. Philadelphia, Houston, and San Antonio are not profitable cities. We may consider relocating or redesigning the strategies in operating the store around the areas.
  3. Cubify CubeX 3D Printer Double Head Print, Lexmark MX611dhe Monochrome Laser Printer, and Cubify CubeX 3D Printer Triple Head Print are the not profitable items that might be removed from the item lines. Since the items are contributed to profitability deduction.
  4. For the standard class delivery option, we can improve the processes during the operation and service offers & promos. Since the standard class is so in demand option.

Thank You!

Terima Kasih!!

--

--