Understanding the Order of Execution in SQL Queries

Amulya Kulkarni
Data And Beyond
Published in
3 min readJun 5, 2023
Photo by Rubaitul Azad on Unsplash

When working with a relational database, understanding the order in which a SQL query is executed is crucial. It not only helps optimize query performance but also allows developers to comprehend the logical flow of data retrieval and manipulation. In this blog post, we will explore the step-by-step order of execution for a SQL query in a typical relational database management system (RDBMS).

  1. FROM and JOINs: The query execution begins with the FROM clause. In this step, the database system accesses the tables specified in the FROM clause and performs any necessary joins between them. Joins combine related rows from different tables based on the specified join conditions. This step retrieves the initial set of data that will be used for further processing.
SELECT *
FROM table1
JOIN table2 ON table1.id = table2.id;

2. WHERE: Once the tables are accessed and joined, the WHERE clause is applied. The WHERE clause filters the rows from the joined tables based on the specified conditions. It allows you to specify criteria that determine which rows should be included in the result set. Rows that do not meet the conditions are eliminated from further processing.

SELECT *
FROM table1
JOIN table2 ON table1.id = table2.id
WHERE table1.category = 'Books';

3. GROUP BY: If a GROUP BY clause is present in the query, the rows are grouped based on the specified columns. This step aggregates the data and creates groups of rows that share common values in the specified columns. It is commonly used with aggregate functions such as SUM, COUNT, AVG, etc., to perform calculations on the grouped data.

SELECT category, COUNT(*)
FROM table1
GROUP BY category;

4. HAVING: Following the GROUP BY step, the HAVING clause is applied. The HAVING clause filters the groups created in the previous step based on the specified conditions. It allows you to apply conditions to the grouped data. Groups that do not satisfy the conditions specified in the HAVING clause are eliminated from further processing.

SELECT category, COUNT(*)
FROM table1
GROUP BY category
HAVING COUNT(*) > 10;

5. SELECT: Once the data is filtered and grouped, the SELECT clause is applied. The SELECT clause specifies which columns or expressions to include in the result set. It allows you to define the projection of the final result by selecting the desired columns or computing new values using expressions.

SELECT name, price
FROM products;

6. DISTINCT: In some cases, you may want to eliminate duplicate rows from the result set. If the DISTINCT keyword is specified, this step removes duplicate rows, leaving only unique rows in the final result. It is particularly useful when you need to retrieve unique values from a column or combination of columns.

SELECT DISTINCT category
FROM products;

7. ORDER BY: If a specific order is required for the result set, the ORDER BY clause is applied. The ORDER BY clause sorts the result set based on the columns specified. It allows you to arrange the rows in ascending or descending order, based on one or multiple columns. This step determines the order in which the rows will appear in the final result.

SELECT name, price
FROM products
ORDER BY price DESC;

8. LIMIT / OFFSET: In some scenarios, you may need to limit the number of rows returned by a query or retrieve a specific subset of rows. The LIMIT and OFFSET clauses help achieve this. The LIMIT clause specifies the maximum number of rows to include in the result set, while the OFFSET clause determines the starting point from which to retrieve rows. This step allows you to paginate through large result sets or fetch a specific range of rows.

SELECT name, price
FROM products
ORDER BY price DESC
LIMIT 10 OFFSET 5;

Conclusion:

Understanding the order of execution for a SQL query is essential for optimizing query performance and comprehending the logical flow of data retrieval and manipulation. By following the steps outlined in this blog post, you can gain insights into how a query is processed by a relational database management system. Remember that while the general order of execution remains the same, specific database systems may have variations and optimizations to enhance query performance.

If you have any suggestions/feedback, please do not hesitate to reach out to me on LinkedIn or you can also subscribe to my new articles.

Happy Learning!

--

--

Amulya Kulkarni
Data And Beyond

Power BI consultant | Data Science Aspirant | Entrepreneur | Book Reviewer | Blogger