How SQL Engine Works: A Simple Guide

Srinivasan Baskaran
Cloudnloud Tech Community
Dec 22, 2023

SQL (Structured Query Language) is a powerful and popular language for working with data in relational databases. It lets users create, update, delete, query, and analyze data. But how does a SQL engine work behind the scenes? How does it understand and execute the queries that users write? In this article, I will explain the basic steps a SQL engine follows to parse a query and return the data to the user.

Step 1: Lexical Analysis

The first step a SQL engine performs is lexical analysis. Lexical analysis is the process of breaking the query text down into smaller units called tokens. Tokens are the basic elements of a query, such as keywords, identifiers, operators, literals, and symbols.

For example, the query SELECT name, age FROM students WHERE age > 18; can be broken down into the following tokens:

  • SELECT (keyword)
  • name (identifier)
  • , (symbol)
  • age (identifier)
  • FROM (keyword)
  • students (identifier)
  • WHERE (keyword)
  • age (identifier)
  • > (operator)
  • 18 (literal)
  • ; (symbol)

Lexical analysis also catches errors at the character level, such as unrecognized characters or unterminated string literals (mismatched quotes). Errors in how the tokens are arranged, such as a missing comma or a misplaced keyword, are caught in the next step, parsing. If an error is found, the SQL engine reports it to the user and stops processing the query.
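
The tokenizing step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular engine implements its lexer: the KEYWORDS set, the TOKEN_PATTERN regular expression, and the tokenize function are all assumptions made for this example, and they cover only a small part of SQL.

```python
import re

# Hypothetical keyword list for this sketch; real engines recognize far more.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}

# One named group per token category. Whitespace is matched so it can be skipped.
TOKEN_PATTERN = re.compile(r"""
    (?P<whitespace>\s+)
  | (?P<literal>\d+(?:\.\d+)?|'(?:[^']|'')*')
  | (?P<word>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<operator><=|>=|<>|!=|=|<|>)
  | (?P<symbol>[,;()*])
""", re.VERBOSE)


def tokenize(query):
    """Split a query string into (text, kind) pairs, reading left to right."""
    tokens = []
    pos = 0
    while pos < len(query):
        match = TOKEN_PATTERN.match(query, pos)
        if match is None:
            # No category matches here, e.g. a stray character or an unclosed quote.
            raise ValueError(f"unexpected character {query[pos]!r} at position {pos}")
        kind = match.lastgroup
        text = match.group()
        pos = match.end()
        if kind == "whitespace":
            continue
        if kind == "word":
            # A word is a keyword if it is in the list, otherwise an identifier.
            kind = "keyword" if text.upper() in KEYWORDS else "identifier"
        tokens.append((text, kind))
    return tokens


for text, kind in tokenize("SELECT name, age FROM students WHERE age > 18;"):
    print(f"{text} ({kind})")
```

Running it on the example query prints one token per line in the same form as the list above. A query with an unclosed quote, such as SELECT 'abc, raises a ValueError instead of producing tokens.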

Step 2: Parsing

The second step is parsing. Parsing analyzes the structure of the query according to the rules of SQL grammar and converts the tokens into a hierarchical representation called a parse tree. A parse tree shows the relationships between the tokens and how they form the clauses and expressions of the query. For example, the parse tree for the query SELECT name, age FROM students WHERE age > 18; can be shown as:

SELECT
├── name
└── age
FROM
└── students
WHERE
└── age > 18

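Parsing reports syntax errors, such as missing or extra symbols or keywords in the wrong place. Most engines then check the parse tree for semantic errors, such as invalid identifiers, incompatible data types, or undefined functions; this check is often described as a separate analysis or binding phase. If any errors are found, the SQL engine reports them to the user and stops processing the query.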
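
To make the parsing step concrete, here is a small recursive-descent parser in Python. It is a sketch under heavy assumptions: it only understands queries of the form SELECT columns FROM table WHERE column operator number, and the names simple_tokens, Parser, check_names, and CATALOG are invented for this example. The parse tree is represented as a plain dictionary.

```python
import re


def simple_tokens(query):
    # Stand-in lexer for this sketch: words, numbers, comparison operators, symbols.
    # Note that findall silently skips anything it does not recognize.
    return re.findall(r"[A-Za-z_]\w*|\d+|<=|>=|<>|[=<>]|[,;*]", query)


class Parser:
    """Recursive-descent parser for: SELECT cols FROM table [WHERE col op number]."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect_keyword(self, word):
        token = self.peek()
        if token is None or token.upper() != word:
            raise SyntaxError(f"expected {word}, found {token!r}")
        self.pos += 1

    def take_name(self):
        token = self.peek()
        if token is None or not re.fullmatch(r"[A-Za-z_]\w*", token):
            raise SyntaxError(f"expected a name, found {token!r}")
        self.pos += 1
        return token

    def parse_select(self):
        self.expect_keyword("SELECT")
        columns = [self.take_name()]
        while self.peek() == ",":
            self.pos += 1
            columns.append(self.take_name())
        self.expect_keyword("FROM")
        tree = {"select": columns, "from": self.take_name(), "where": None}
        if self.peek() is not None and self.peek().upper() == "WHERE":
            self.pos += 1
            tree["where"] = self.parse_comparison()
        if self.peek() == ";":
            self.pos += 1
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def parse_comparison(self):
        left = self.take_name()
        operator = self.peek()
        if operator not in ("=", "<", ">", "<=", ">=", "<>"):
            raise SyntaxError(f"expected a comparison operator, found {operator!r}")
        self.pos += 1
        value = self.peek()
        if value is None or not value.isdigit():
            raise SyntaxError(f"expected a numeric literal, found {value!r}")
        self.pos += 1
        return {"op": operator, "left": left, "right": int(value)}


# Hypothetical catalog: table name -> set of column names.
CATALOG = {"students": {"name", "age"}}


def check_names(tree, catalog):
    """Semantic check: every table and column in the tree must exist."""
    table = tree["from"]
    if table not in catalog:
        raise LookupError(f"unknown table {table!r}")
    referenced = list(tree["select"])
    if tree["where"] is not None:
        referenced.append(tree["where"]["left"])
    for column in referenced:
        if column not in catalog[table]:
            raise LookupError(f"unknown column {column!r} in table {table!r}")


tree = Parser(simple_tokens("SELECT name, age FROM students WHERE age > 18;")).parse_select()
check_names(tree, CATALOG)
print(tree)
```

A query with a missing comma, such as SELECT name age FROM students, fails in parse_select with a SyntaxError, while SELECT grade FROM students parses correctly but fails in check_names because the column does not exist in the catalog. That split mirrors the difference between syntax errors and semantic errors.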

Step 3: Optimization

The third step is optimization. Optimization is the process of finding the most efficient way to execute the query given the available resources and constraints. The optimizer analyzes the parse tree and applies various techniques to improve the performance of the query.

Some of the common optimization techniques are:

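  • Query rewriting: This technique involves modifying the query to make it simpler or more efficient without changing its meaning. For example, the query SELECT * FROM students WHERE name LIKE 'A%'; can be rewritten as SELECT * FROM students WHERE name >= 'A' AND name < 'B'; so that the engine can use an index range scan on name where such an index exists. The two forms are equivalent only when LIKE and the comparison operators follow the same collation and case-sensitivity rules.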
  • Query plan generation: This technique involves creating a query plan that specifies the order and method of executing the query. A query plan consists of a series of steps or operations that are performed on the data, such as scanning, filtering, sorting, joining, aggregating, and projecting. For example, the query plan for the query SELECT name, age FROM students WHERE age > 18; can be shown as:
PROJECT name, age
└── FILTER age > 18
    └── SCAN students
  • Query plan selection: This technique involves choosing the best query plan among the possible alternatives based on the estimated cost of each plan. The cost of a query plan depends on various factors, such as the size and distribution of the data, the indexes and statistics of the tables, the available memory and disk space, and the concurrency and workload of the system. Most SQL engines use a cost-based optimizer (CBO) to compare the candidate plans and select the cheapest one.
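
The idea behind cost-based plan selection can be sketched in Python as follows. Everything here is a toy assumption made for illustration: the Plan class, the two candidate plans, the cost formulas, and the lookup_cost constant are invented, and real optimizers use much richer statistics and cost models.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    cost: float  # abstract cost units, roughly "rows touched"


def full_scan_plan(table_rows):
    # Every row is read once and tested against the filter.
    return Plan("FILTER age > 18 over SCAN students", cost=float(table_rows))


def index_range_plan(table_rows, selectivity, lookup_cost=4.0):
    # Only matching rows are visited, but each one costs an index lookup
    # plus a fetch from the table (lookup_cost is a made-up constant).
    matching_rows = table_rows * selectivity
    return Plan("INDEX RANGE SCAN on students(age) for age > 18",
                cost=matching_rows * lookup_cost)


def choose_plan(table_rows, selectivity, has_index):
    """Build the candidate plans and return the one with the lowest estimated cost."""
    candidates = [full_scan_plan(table_rows)]
    if has_index:
        candidates.append(index_range_plan(table_rows, selectivity))
    return min(candidates, key=lambda plan: plan.cost)


# Few rows match: the index plan is estimated to be cheaper.
print(choose_plan(table_rows=10_000, selectivity=0.05, has_index=True).name)
# Most rows match: reading the whole table is estimated to be cheaper.
print(choose_plan(table_rows=10_000, selectivity=0.90, has_index=True).name)
```

The same query gets a different plan depending on the estimated fraction of rows that satisfy age > 18, which is why up-to-date table statistics matter to the optimizer.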

Step 4: Execution

The fourth and final step is execution. Execution is the process of carrying out the query plan and producing the result set for the user. The engine reads the data from physical storage, applies the operations specified in the query plan, and returns the output to the user.

For example, the execution of the query SELECT name, age FROM students WHERE age > 18; can be shown as:

  • SCAN students: The SQL engine reads the rows of the students table stored on disk or in memory.
  • FILTER age > 18: The SQL engine applies the condition age > 18 to each row of the students table and discards the rows that do not satisfy the condition.
  • PROJECT name, age: The SQL engine selects the columns name and age from the filtered rows and forms the result set.
  • RETURN: The SQL engine returns the result set to the user.
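
The four steps above can be mimicked with Python generators, where each operator pulls rows from the one below it. This is a simplified sketch: the STUDENTS rows and the scan, filter_rows, and project functions are made up for this example, and a real engine reads pages from storage rather than a Python list.

```python
# Hypothetical in-memory stand-in for the students table.
STUDENTS = [
    {"name": "Asha", "age": 17},
    {"name": "Ben", "age": 19},
    {"name": "Chen", "age": 18},
    {"name": "Dara", "age": 22},
]


def scan(table):
    # SCAN: produce every row of the table, one at a time.
    for row in table:
        yield row


def filter_rows(rows, predicate):
    # FILTER: pass along only the rows that satisfy the condition.
    for row in rows:
        if predicate(row):
            yield row


def project(rows, columns):
    # PROJECT: keep only the requested columns of each row.
    for row in rows:
        yield {column: row[column] for column in columns}


# Build the plan bottom-up: SCAN -> FILTER age > 18 -> PROJECT name, age.
plan = project(
    filter_rows(scan(STUDENTS), lambda row: row["age"] > 18),
    ["name", "age"],
)

# RETURN: pulling from the top operator drives the whole pipeline.
result = list(plan)
print(result)
```

Because the operators are generators, no rows are read until the result is requested, and each row flows through the filter and the projection before the next one is scanned. With the sample rows, only Ben and Dara are returned, since Chen's age of 18 does not satisfy age > 18.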

Final Takeaway:

In this article, we have walked through the basic steps a SQL engine follows to parse a query and return the data to the user. We have seen that a SQL engine performs four main steps: lexical analysis, parsing, optimization, and execution. Each step involves various processes and techniques that ensure the correctness and efficiency of the query. A SQL engine is a complex and sophisticated system that handles data manipulation tasks on behalf of its users. By understanding how it works, we can write better and faster queries and improve the performance of our database applications.

That’s it, thank you for reading.



Qualified IT professional with over 20+ years of experience in rendering Technical Expertise in Microsoft, Cloud (Azure, AWS) .Net/SQL Development, DevOps, RDBM