How a SQL statement is executed

Huypc
Goalist Blog
Published in
3 min readAug 31, 2023

As a software engineer, we know that SQL is a query language that allows us to store and process data in a database management system (DBMS). Somehow, we only know the syntax, condition, join, etc. Therefore, in this blog, I would like to share a topic that how a SQL statement is executed in database.

How a SQL statement is executed under the hood

For example, I want to find an employee with id. My SQL statement is something like below.

SELECT * FROM employees WHERE id = 100

We all know that the result is the employee whose id is 100. Then, the database will do the following steps to execute it:

  1. Is the syntax correct?

If the syntax is wrong, it will return a message instantly in your text editor (MySQL Workbench, PG Admin, e.g.). If not, continue to the second step.

2. Is it “meaningful”?

At this step, it will check that the statement has any violations to the schema.

For example, your schema has a table called employees, but your statement is:

SELECT * FROM employee WHERE id = 100

Although the syntax is correct, it still throws error due to the wrong table’s name. In this case, it is not employee. In another case, when you select a column that does not exist in the table, or you use a user that does not have permission on the querying table, that violates the constraints of schema, so it also throws error here.

As you can see, the first 2 steps do a lot of checks, but it does not really cost performance much. Most of database can handle it easily.

3. Is it in Shared pool?

At this step, the database will check if there are any statement like this has been executed before.

For example, in real life, when you travel to somewhere, if you have already known the place before, you can arrive quickly. If not, you will ask people around you if they know the destination. They will show you how to arrive there by instructions.

The same with database, if the statement are executed or stored in Share pool before, it will be executed instantly. This mechanism is soft parsing.

In the opposite direction, the database will use hard parsing to execute the statement. It will evaluate all execution plans which can be happened and choose the smallest cost ones. After the evaluation, it will create execution steps in details to perform the query.

The most cost for a query execution is in step 3, hard parsing.

Hard parsing will take a lot of time because the database considers the statement as string and do lots of comparisons in Shared pool.

A small concept to use variable in database

There is a small case study that can help you understand why we should use variable with the SQL statement.

Case Study: Assume that the table employees has 100000 records. Then you want to find id from 1 to 100000. You will execute the below SQL statement.

SELECT * FROM employees WHERE id = 1;
SELECT * FROM employees WHERE id = 2;
SELECT * FROM employees WHERE id = 3;
.
.
.

SELECT * FROM employees WHERE id = 100000;

Because the where condition is used on PK (indexed) so it will run quickly. However, in practical, it will take a lot of time to finish and return the result. Why? Because each statement is different here.

SELECT * FROM employees WHERE id = 1;

And

SELECT * FROM employees WHERE id = 2;

As I mentioned, the database considers statments as string so 1 is different from 2. That means the database has executed 100000 statements instead of 1. It had to do hard parsing thousand times.

What if we use variable instead?

So, I rewrite the above statement into this.

SELECT * FROM employees WHERE id = :id;

Now I use a variable called id. After the first time, the database needs to do hard parsing only 1 time and the completion time is shorter multiple times.

--

--