5 Real-World SQL Queries To Know

Paras Patil
CodeX
Published in
4 min readJan 4, 2023

This post addresses some of the common situations that a Data Analyst might face during their routine work. It covers some functions & summarization techniques that you might find useful.

For demonstration purposes I have used SQL Server.

1. Date / Number Series

Objective: To generate date / Number series for a particular range

SQL Concept: Recursive CTEs (Common Table Expressions)

Typically recursive CTE has following structure

WITH expression_name AS
(
Anchor query
UNION ALL
Recursive query
)
SELECT * FROM expression_name

Example: In the below example we are trying to generate a date series starting from ‘2022–01–01’ to ‘2022–12–31’

Expected Output + Query:

Use Case:
• To generate date series for a Date Dimension table.
• To generate number series, Even/Odd series etc.

Note:
• Maximum recursion is set to 100 as a default. Set the Max recursion as per your output requirement.

2. Running Totals

Objective: To calculate running/cumulative totals over a specified column

SQL Function: SUM() Over() — Window Function

Example:
In the below shown table we need to compute the running totals over Sales column.

Running Total (C3) = Current Sales (B3)+ Previous Cumulative Month Sales (C2)

Expected Output + Query :

Use Case:
To calculate cumulative Qty, Cost, Hours etc., based on product, year etc.

Note:
• If the running total is to be calculated by certain group then use
Partition By’ in the Over() Clause.

3. Simple Moving Average

Objective: To calculate a moving average for n records

SQL Function: LAG() OVER() — Window Function

Example: Using the below table we need to calculate moving average of 3 months

Expected Output + Query:

Use Case:
1. To calculate average sales quarterly.
2. Simple Moving Average (SMA) is used in Time series data for smoothing.

4. Pivot Tables

Objective: To Pivot values based on two or more columns

SQL Operator: PIVOT()

Example:
Input Table (Excel):

Expected Output (Excel):

Expected Output + Query:

Use Case:
1. Pivot tables are extremely popular in summarizing data in a concise way.
2. Pivot queries can be wrapped inside stored procedures along with parameters for easy usage by downstream tools like SSRS, Power BI, Tableau etc.

5. Five Number Summary

Objective: A statistical summary displaying Minimum, Maxium, Lower Quartile (Q1), Upper Quartile (Q3) & Median (Q2) of a dataset.

SQL Function: Percentile_Cont() WITHIN GROUP() Over()

Example:
Considering the table from Running Total example i.e. MonthlySales, we need to calculate five number summary over sales column.

Below is a query to calculate the Quartiles over sales column. I have transposed the output of Quartile calculation using UNPIVOT so as to make all values under one column.

Use Case:
1. It provides an overall distribution of dataset and is used to plot Box & whiskers chart.
2. Typically, it is used for outlier detection during Exploratory Data Analysis.

Note:
• I have enclosed quartile calculations under a Stored procedure, so that it can be reused & works fine even if the backend data has been updated.

Hope this was helpful to you in some way. Do let me know if you have some other method to achieve the same result.

Cheers!

--

--