Photo by Olav Ahrens Røtne on Unsplash

Cracking the Code: Understanding the Order of Execution in SQL Queries

Lauren Rosenthal
Published in
4 min readMay 18, 2023

--

When you first start learning SQL, you most often begin with learning the statements and clauses in this order:

- SELECT

- FROM

- WHERE

- GROUP BY

- HAVING

- ORDER BY

That’s the way they’re written in our queries, so it makes sense to learn them that way. But, is that the same order in which the query is executed?

Let’s find out!

Diving in

Let’s say we have two tables, employees and department.

Employees table:

The employees table contains 1000 rows with the following columns: employee_id, first_name, last_name, email, age, salary, department_id, and bonus.

Department table:

The department table contains the following columns: department_id and department.

The question:

Using the employees and department tables, we want to find the average salary by department.

We can do this with the following query:

The result:

The above query returns this result:

But what’s going on behind the scenes? How is this query actually executed? Here’s a sneak peak:

The order of execution

1. FROM and JOINs

The FROM clause and any JOINs are executed in order to determine the tables involved in the query and form the source for retrieving the data.

In the example query above, this means it’s taking the entire employees table and joining the department table if there is a corresponding department_id.

2. WHERE

The WHERE clause filters the rows based on the specified constraints. Rows that don’t meet the constraints are removed from the set of data.

In the example query, we filter out any row from the employees table that contains a person with an age greater than 30.

3. GROUP BY

The GROUP BY clause is executed on the rows that remain after the WHERE clause has filtered our original result set.

In the example query, our results get grouped by department. That means there are only as many rows as there are unique department values.

4. HAVING

The HAVING clause filters out the grouped rows that don’t meet the criteria.

In the example query, the HAVING clause filters out any groups of records that contain fewer than 5 employees.

5. SELECT

The SELECT clause is executed to determine the columns to be selected and retrieved from the tables.

In the example query, the department column and the salary column are selected, with the salary aggregated to return the rounded average salary per department.

6. ORDER BY

The ORDER BY clause is executed to sort the results in a specified order.

In the example query, the results are returned from highest average salary to lowest average salary.

7. LIMIT

If present, the LIMIT clause would specify the number of rows returned and discard the rest.

Why does it matter?

You may be thinking at this point, “okay, great, but why does the order of execution matter?” There are a few reasons:

  1. Query optimization: by understanding the order in which your query executes, you can optimize your queries for better performance.
  2. Expected results: understanding the order of execution can help you to get the desired results by applying appropriate filters, aggregations, and conditions in the correct order.
  3. Troubleshooting: understanding the order of execution helps in identifying where data is not being filtered correctly, grouped appropriately, or sorted as expected.
  4. Developing complex queries: understanding the order of execution helps you organize and structure your queries correctly, ensuring the logic flows smoothly and producing the desired output.

An example:

If we wanted to return the name and total compensation of any employee who made over $100k/year, we might write a query like this:

In MySQL, we get the following error message: Error Code: 1054. Unknown column ‘total_comp’ in ‘where clause’. Can you figure out why?

Based on the order of execution, the WHERE clause is executed before the SELECT statement, so the query doesn’t recognize the alias ‘total_comp’. Instead, we could do this:

And as we would expect, we return all of the employees who meet our criteria.

Success!

Conclusion

As you progress in your journey of learning and using SQL, being able to optimize, debug, and troubleshoot your queries will be of the utmost importance. Understanding how the order of execution occurs will help you do just that!

Photo by Rahul Dey on Unsplash

If you liked this, we’d love for you to subscribe to our publication.

And drop us a comment letting us know what you’d like to see next. You can also submit your own writing for us to promote, so don’t be shy if you would like to become a contributor.

Happy learning!

--

--

Lauren Rosenthal
Learning Data

I'm an Account Executive, Learning Guide, and Data Analyst at Maven Analytics. I love sharing my own journey and tips and tricks I picked up along the way.