The Look E-commerce Data Analysis

Chisom Nnamani
7 min readSep 8, 2023

Unlocking E-commerce Insights: Exploring Data in BigQuery with SQL

In the world of data analytics, the ability to extract valuable insights from vast datasets is a superpower. And one of the data analysis tools that could give you that superpower is SQL.

Photo by macrovector on Freepik.com

Overview

In this project, I utilized SQL to analyze the e-commerce data of TheLook. TheLook is an e-commerce clothing store developed by the Google Looker team. This E-commerce data is hosted on Google Big Query and contains information about customers, products, orders, logistics, web events, and digital marketing campaigns. The contents of this dataset are provided to industry practitioners for the purpose of product discovery, testing, and evaluation.

At the time of this analysis, the e-commerce store recorded sales transaction data from August 2019 to September 2023. The entire dataset that can be accessed here currently consists of 7 tables, but in this analysis, we will only be working with the following tables:

✨ Order_items

✨ Orders

✨ Products

✨ Users

Side note: Google BigQuery has several public datasets that are updated periodically and can be used to build projects for your portfolio.

The Task

The purpose of this analysis is to understand and answer some business questions regarding the performance of the Look e-commerce marketplace, gain insights, and provide some recommendations to increase revenue.

We will be focusing on orders that were not canceled or returned since we will be dealing with metrics related to revenue.

Answering the Business Questions

1. How much are we selling monthly? Is it high or low compared to last month?

Solution:

  • Use the EXTRACT function to get the unique month numbers.
  • Get the revenue by multiplying the sale price with the number of items sold.
  • Use the DISTINCT function to get the unique orders placed and the user IDs who made a purchase.
Image by Author
Monthly Sales — Image by Author

Insights:

  • Revenue has increased steadily
  • Total orders and the total number of customers also increased.
  • The total number of customers for each month was greater than the total number of orders placed. This means that customers were placing more than one order at a time.

2. Who are our customers? Which country do we have major customers coming from? Which Gender and Age group brought in the most profit?

Solution:

a. Customers by Country

  • Use a CTE to get the distinct user ID, country, and total number of users or customers who were male or female.
  • From the CTE, Select the country, the count of distinct customer IDs, and the count of customers who were male or female.
Image by Author
Customers by Country — Image by Author

Insights:

  • The majority of our customers come from China, the United States, and Brasil.
  • The proportion of revenue generated by each gender is almost the same in each country.

b. Customers by Gender

  • Select the gender, and calculate the sum of revenue and total quantity of orders placed by each gender.
Image by Author
Customers by Gender — Image by Author

Insights:

Male customers generated more revenue and bought more products in quantity than female customers.

c. Customers by Age Group

  • Use the case statement to group the customers into different age groups.
  • Count the distinct number of customers in each of the age groups.
Image by Author
Customers by Age Group — Image by Author

Insights:

  • We do not have customers who are kids.
  • There is a relatively low number of customers in the Young adult and teenage groups.

3. What brands and product categories are we selling more and the least? What are we making money on?

a. brand sales

Image by Author
Most Profitable Brands — Image by Author
Least Profitable Brands Image by Author

Insights:

  • Diesel brought in the most revenue.
  • Allegra K had the most sold quantity, double the amount sold by Calvin Klein.
  • Marshal Brand brought in the least revenue.
  • We have a lot of brands that sell only 1, 2, or 3 product quantities.

b. product category sales

  • Select the product category, and calculate the sum of the revenue and quantity of products generated by each brand.
Image by Author
Most Profitable Product Category — Image by Author
Least Profitable Product Category — Image by Author

Insights:

  • The most profitable product category is Outwear & Coats, followed by Jeans and Sweaters.
  • The least profitable product categories are clothing sets, Jumpsuits & Rompers.

4. What are the most canceled and returned brands and product categories?

a. brand cancellation and return

  • Select each brand and use conditional aggregation to count the number of rows where the status of the brand is Cancelled or Returned.
Image by Author
Most Canceled Brands — Image by Author
Most Returned Brands — Image by Author

Insights:

  • The most canceled brands are also the most returned brands.
  • Calvin Klein and Carhartt, which are the second and third most canceled and returned brands, are also two of the most purchased and profitable brands.

b. Product category cancellation and return

  • Select each product category and use conditional aggregation to count the number of rows where the status of the product category is canceled or returned.
Image by Author
Most Canceled Product Category — Image by Author
Most Returned Product Category — Image by Author

Insights:

  • The most canceled categories are also the most returned categories.
  • Jeans and Fashion Hoodies & Sweatshirts which are the second and third most canceled and returned categories, are also two of the most purchased or profitable categories.

5. What marketing channel are we doing well on?

  • Select the traffic source and count the total number of customers that made purchases via each traffic source.
Image by Author
Marketing Channel Sales — Image by Author

Insights:

  • The channels that brought in the most customers were Search, followed by Organic, and Facebook.

6. We will provide promotions during Chinese New Year celebrations for female customers in China via email.

Please provide data on how many customers we should promote each email!

  • Select the ID and email of the customers.
  • Use the WHERE clause to filter the data for females living in China.
Image by Author
Sample Result of Female Customers in China — Image by Author

Insight:

  • We have 16,728 female customers who are in China.

7. Provide a list of 10 customer IDs and emails with the largest total overall purchase. We will give a discount for Campaign 3.3!

  • Select the customer ID and email
  • Use the Sum function to calculate the total purchase value for each customer.
  • Use the Limit function to limit the result to only the top 10 high-value customers.
Image by Author
Top 10 High-Value Customers — Image by Author

8. Create a query to get frequencies, average order value, and the total number of unique users where status is completed grouped by month (Skillset: Intermediate SQL)

  • Use the FORMAT_DATE function to format the created_at date column into a “Year-Month” format and assign it the alias month_year.
  • Calculate the ratio of the count of unique order_id to the count of unique user_id, round the result to two decimal places, and assign it the alias frequencies.
  • Calculate the average order value by dividing the sum of sale_price by the count of unique order_id, rounding the result to two decimal places, and assigning it the alias Average_order_value.
Image by Author
Monthly Frequencies, AOV, and Total Unique Users — Image by Author

Recommendations:

  • Attract New Customers with First Purchase Discounts: Implement introductory discounts for new customers, applied to their initial purchase. This strategy aims to boost customer retention rates.
  • Enhance Sales through Exclusive Offers for Regular Clients: Introduce enticing deals exclusively tailored for our loyal clientele. This approach seeks to drive higher sales volume from our repeat customers.
  • Curate Trendy Products for Younger Demographics: Expand our product range to include items that align with the preferences of children and teenagers. Incorporating globally or locally trending products can enhance their appeal and drive sales.
  • Extend Reach through Diverse Advertising Channels: Utilize alternate platforms like Instagram, TikTok, and popular local social media networks to run strategic advertisements. This tactic aims to foster greater engagement and brand awareness.
  • Forge Franchise Partnerships with High-Performing Brands: Establish collaboration agreements with brands that demonstrate exceptional sales and revenue performance. By fostering these partnerships, we can secure better terms, amplify profitability, and stimulate additional promotional endeavors to bolster sales.

And this comes to the end of my analysis of the TheLook E-commerce data on Google Big Query.

Knowing how to analyze data using cloud storage tools like BigQuery is a very important data analysis skill that every enthusiastic data professional should strive to acquire.

I hope you enjoyed me walking you through this project.

Connect with me on LinkedIn 🚀 and see how I have beautifully documented this project on my GitHub Repo here! ✌🏽

--

--

Chisom Nnamani

Data & Analytics Engineer | Passionate about Data | Connect with me on Linkedin — https://www.linkedin.com/in/chisom-nnamani/