Advanced SQL JOIN Techniques: Mastering Complex Data Queries for Marketing Insights

Bohsun Chen
4 min readJun 14, 2024

--

Photo by Ilya Pavlov on Unsplash

After mastering the basics of SQL JOINS. In this section, I’ll explore common advanced JOIN scenarios with the combination of other SQL functions. Let’s strike while the iron is hot!

  1. Combining Multiple Tables with Different JOIN Types:
  • Scenario: combining customer data with sales, product information, and campaigns to analyze sales trend and customer preferences.
SELECT 
customers.customer_name,
products.product_name,
sales.sale_amount,
campaigns.campaign_name
FROM
customers
LEFT JOIN sales ON customers.customer_id = sales.customer_id
INNER JOIN products ON sales.product_id = products.product_id
LEFT JOIN campaigns ON sales.campaign_id = campaigns.campaign_id;
  • Output: it ensures that all customers are included, along with their purchase details and the products they bought, while also including campaign information if available.
| customer_name | product_name | sale_amount | campaign_name |
|---------------|--------------|-------------|---------------|
| Alice | Coffee Maker | 100 | Summer Sale |
| Bob | Blender | 150 | Winter Promo |
| Carol | Toaster | 200 | NULL |
| Dave | NULL | NULL | NULL |

2. Self-JOIN to Find Related Records:

  • Scenario: identifying customers who have referred other customers, including their retferral counts.
SELECT 
referrer.customer_name AS referrer_name,
COUNT(referee.customer_id) AS referral_count
FROM
customers AS referrer
JOIN
customers AS referee ON referrer.customer_id = referee.referred_by
GROUP BY
referrer.customer_name;
  • Output: it performs a self-join on the ‘customers’ table to link each customer to their referees, then counts how many referrals each customer has made.
| referrer_name | referral_count |
|---------------|----------------|
| Alice | 2 |
| Bob | 1 |

3. Full Join to Identify Missing Links:

  • Scenario: combining customer and sales data to identify customers with no sales and sales with no corresponding customer records.
SELECT 
COALESCE(customers.customer_name, 'No Customer') AS customer_name,
COALESCE(sales.sale_amount, 0) AS sale_amount
FROM
customers
FULL JOIN sales ON customers.customer_id = sales.customer_id;
  • Output: it aims to combine data from two tables ‘customers’ and ‘sales’ using a FULL JOIN and handle NULL values using the ‘COALESCE’ function. The goal is to create a comprehensive list of customer names and their corresponding sales amounts, including cases where there might not be a match between tables.
| customer_name | sale_amount |
|---------------|-------------|
| Alice | 50 |
| Alice | 75 |
| Bob | 150 |
| Bob | 175 |
| Carol | 200 |
| Dave | 0 |
| Eve | 0 |
| No Customer | 100 |

4. Optimizing JOIN Queries with Subqueries:

  • Scenario: Finding top-performing products based on total sales amount.
SELECT 
products.product_name,
total_sales.total_amount
FROM
products
INNER JOIN (
SELECT
product_id,
SUM(sale_amount) AS total_amount
FROM
sales
GROUP BY
product_id
) AS total_sales ON products.product_id = total_sales.product_id
ORDER BY
total_sales.total_amount DESC
LIMIT 10;
  • Output: by using the subquery within the JOIN function, it aims to combine two tables ‘products’ and sales, while simultaneously aggregating the sales amount in the subquery to optimize the query performance.
| product_name     | total_amount |
|------------------|--------------|
| Coffee Maker | 450 |
| Blender | 440 |
| Microwave | 360 |
| Toaster | 250 |
| Food Processor | 220 |
| Juicer | 230 |
| Kettle | 200 |
| Air Fryer | 160 |
| Rice Cooker | 170 |
| Mixer | 180 |

5. Using Window Functions with JOINs for Analytics:

  • Scenario: calculating the cumulative sales amount for each customer over time.
SELECT 
customers.customer_name,
sales.sale_date,
sales.sale_amount,
SUM(sales.sale_amount) OVER (PARTITION BY customers.customer_id ORDER BY sales.sale_date) AS cumulative_sales
FROM
customers
JOIN sales ON customers.customer_id = sales.customer_id
ORDER BY
customers.customer_name,
sales.sale_date;
  • Output: it aims to combine two tables “customers” and “sales” with JOIN function to generate a list of customer names, their sale dates, sale amounts, and cumulative sum of their sales over time by window function.
| customer_name | sale_date  | sale_amount | cumulative_sales |
|---------------|------------|-------------|------------------|
| Alice | 2023-01-01 | 50 | 50 |
| Alice | 2023-02-01 | 75 | 125 |
| Bob | 2023-01-15 | 150 | 150 |
| Bob | 2023-01-20 | 200 | 350 |
| Carol | 2023-03-01 | 200 | 200 |
| Dave | 2023-04-01 | 100 | 100 |

6. Using conditional JOIN to analyze different promotional campaigns.

  • Scenario: to analyze customer purchases based on different promotional campaigns.
SELECT 
customers.customer_name,
sales.sale_date,
sales.sale_amount,
campaigns.campaign_name
FROM
customers
JOIN
sales ON customers.customer_id = sales.customer_id
JOIN
campaigns ON sales.sale_date BETWEEN campaigns.start_date AND campaigns.end_date
ORDER BY
customers.customer_name,
sales.sale_date;
  • Output: it combines three tables “customers”, “campaigns”, and “sales” with JOIN funciton, while using conditional JOIN to filter the campaigns table with corresponding start_date and end_date to locate the desired promotional campaigns with its meta data.
| customer_name | sale_date  | sale_amount | campaign_name |
|---------------|------------|-------------|---------------|
| Alice | 2023-01-05 | 50 | Winter Sale |
| Alice | 2023-04-01 | 75 | April Deals |
| Bob | 2023-01-15 | 150 | Winter Sale |
| Bob | 2023-04-05 | 175 | April Deals |
| Dave | 2023-03-10 | 100 | Spring Promo |

Final Thoughts:

Thank you for joining this journey into SQL JOINs. I’ll keep exploring the vast possibilities in related to the toolbox of data analysis. Happy learning!

If you are also interested in revisiting the basics JOINs function, here is the link

References:

  • Javier Canales Luna (2024). 20 Top SQL Joins Interview Questions. [online] Datacamp.com. Available at: https://www.datacamp.com/blog/top-sql-joins-interview-questions [Accessed 14 Jun. 2024].
  • ‌Tanimura, C. (2021). SQL FOR DATA ANALYSIS : advanced techniques for transforming data into insights. S.L.: O’reilly Media.‌

--

--

Bohsun Chen
0 Followers

I am an enthusiastic data scientist and marketing analyst with a diverse background in e-commerce/ city bike system/ hotel business.