Learning Journey in SQL

From A to SQL: How You Can Use Subqueries for More Complex Queries

Practical subquery examples using built-in functions for improved data analysis

Martyna Adam
Learning SQL

--

The “From A to SQL” series is growing! So far, we’ve discussed foundational concepts such as setting up databases and establishing relationships, as well as looking at how joins can advance your data analysis. You can read them in any order for the theory, but if you want to follow the examples at the end, I recommend following each one in sequence. You can access Part 5 here.

Photo by Kari Shea on Unsplash

In this article, I’ll be covering everything about subqueries, providing examples for when you can use it, and how you can combine built-in functions to make your data analysis even more powerful! By mastering this concept, you will be well underway to executing intricate queries, and impressing those who look at your portfolio!

So, what’s a subquery?

In the simplest form, a subquery is just a normal query which is inside another query. It might not sound that impressive but the logic behind it is quite important. Why? Your subquery syntax must be correct, so that it can be executed without errors and provide a result. The use of subqueries is incredibly useful, especially when you want to perform a single operation but need multiple distinct steps, which usually involve using multiple tables.

I’ve heard the term ‘nested query’ and ‘inner query’. What’s the difference?

The terms ‘subquery’, ‘nested query’ and ‘inner query’ are all used interchangeably. Let’s look at an example syntax below to make it clearer:

Example subquery syntax. Image created by author.

The red section of this query is what we refer to as a ‘subquery’ or ‘inner query’. It’s easier to remember because you can find it inside of larger queries. SQL will execute this part first, and the result will be used to run the outer query (the blue section). The subquery should always be wrapped in parentheses and for readability, it’s best practice to indent the subquery.

How else can I make subqueries more powerful?

There are some additional things you can do with your subqueries to allow you to dive deeper into your data and perform enhanced data analysis. Firstly, you can add an alias to your subquery (straight after the last parentheses), and then use the alias in the outer query. Secondly, you can use built-in functions, such as aggregate functions, inside the subquery to locate the max, average, min or sum or a value which can be used in the outer query. Finally, the limit to the levels of nesting you can complete is 32! Although I would advise against having this many levels in a single query, as it will get complicated very quickly, you could write a correct query which has 3–5 nested subqueries.

Are there any disadvantages to using subqueries?

Overall, subqueries are powerful, and can break down complex queries into smaller, logical steps. Due to this, the readability of the code is improved, and code maintenance is easier. However, the server, particularly in MySQL is optimised for using joins over subqueries. Therefore, in certain instances it may be faster to perform a subquery if it is rewritten as a JOIN. To learn more about JOINS, you can read my previous article here.

Now that we covered the theory behind subqueries, let’s see how they work with some practical examples.

Ready? Set? SELECT.

As usual, we will use the example of the business database that we have been working on and developing throughout the chapters. You can refer to Part 2 to see how we started off.

Q1: Find the name and price of items located in Paris and sort the result by price, showing the highest price first.

SELECT 
I_Name AS 'Item Name',
Price
FROM item
WHERE I_ID IN
(SELECT I_ID
FROM supply
WHERE location = 'Paris')
ORDER BY Price DESC;
MySQL results for running query #1. Image created by author.

As this is a simple example, we can follow the logic in the individual steps to make it more clear on how we got the result. If we run the subquery on its own, we find that the Item ID’s ‘I3’ and ‘I4’ are in Paris. Now, we can substitute the subquery with ‘I3’ and ‘I4’, and the outer query will select the name and price of items where the item ID is I3 and I4.

Q2. Find the name and price of items not located in Rome, sorted by price (lowest price first), and then sorted by name (alphabetical first).

SELECT
I_Name AS 'Item Name',
Price
FROM item
WHERE I_ID NOT IN
(SELECT I_ID
FROM supply
WHERE Location = 'Rome')
ORDER BY Price ASC, I_Name ASC;
MySQL results for running query #2. Image created by author.

In this example, we don’t have any items which begin using the same name, so the sort by name may not be that clear.

Q3. Find the Role Title for employees whose salary is greater than the average salary of the employees

SELECT 
Role_Title AS 'Role Title',
Salary
FROM employees
WHERE salary >
(SELECT AVG(salary)
FROM employees)
ORDER BY Salary ASC;
MySQL results for running query #3. Image created by author.

Here, we can tell that those employees who are managers or in a senior sales position are paid more than the average salary for all the employees.

To make our examples better, let’s add more data. I prepared an Excel file with some data and saved as a .csv file. I then uploaded this data into the orders table that was already set up through the Import Data Wizard. This is a handy tool when you are working with large amounts of data. Rather than spending a huge amount of time hand coding all the records, you can upload the data, and get straight back into querying. One thing to keep an eye on is that saving a .csv file can result in some data loss. Therefore, try incorporating some data quality checks prior to developing any more queries.

Q4. Show the first three letters of the order month and shipping day and the count of sales for all the sales.

SELECT 
LEFT(sales.order_month, 3) AS 'Order Month',
LEFT(sales.shipping_day, 3) AS 'Shipping Day',
sales.count_sales AS 'Count of Sales'
FROM
(SELECT
Order_Month,
Shipping_Day,
COUNT(Order_NR) AS count_sales
FROM Orders
GROUP BY 2,1) AS sales
GROUP BY 1,2,3
ORDER BY 3 DESC;
MySQL results for running query #4. Image created by author.

Now it’s easier to see which months where most popular for placing orders as well as the most frequent shipping days.

Congratulations! You have just expanded your SQL technical toolkit to include subqueries. I encourage you to try out writing your own subqueries and see how you can use various built-in functions to advance your skills. If you get stuck, the best thing to do is start by writing the inner query first, and then moving on to writing the outer query after. It can be tricky writing your own questions to answer using subqueries, so it could also be worth finding some questions online which you can answer to test your knowledge.

Good luck and see you in the next part of the learning journey :)

Go from SELECT * to interview-worthy project. Get our free 5-page guide.

--

--