A real-world example on using sub-queries in Laravel to optimize performance
What is subquery ?
A subquery, often known as a nested query or sub-SELECT, is a SQL query that resides within another query. This nested structure allows for a dynamic and versatile approach to querying data, where the result of the inner query is used as a part of the outer query’s condition or calculation.
Basic Sql example:
To grasp the practical application of subqueries in SQL, let’s delve into a basic scenario where we aim to retrieve users who have authored more than 5 posts.
SELECT * FROM users WHERE (SELECT COUNT(id) FROM posts WHERE posts.user_id = users.id) > 5;
In this query:
- The outer query selects all columns from the
users
table. - The subquery calculates the count of posts for each user (
COUNT(id) FROM posts WHERE posts.user_id = users.id
). - The WHERE clause filters the users based on the result of the subquery, selecting only those users with more than 5 posts.
Real world example:
In this example, we will explore the application of subqueries in Laravel to streamline and enhance a query operation. Let’s tackle a scenario where we aim to retrieve a list of employees along with their names, join dates, and the amounts of their most recent sales.
Database Schema:
- Employees Table: Contains fields such as
id
,first_name
,last_name
, andjoined_at
to store employee details. - Sales Table: Consists of fields like
id
,amount
,date
, andemployee_id
to record sales transactions, including the associated employee.
EmployeeController:
views/employees/index.blade.php:
And we will get the following queries performed:
Upon closer inspection, it is evident that our current implementation suffers from the (N+1) query problem, leading to suboptimal performance
To mitigate this issue and enhance the efficiency of our data retrieval process, let’s proactively address the (N+1) query concern by eagerly loading the sales
relationship:
EmployeeController:
views/employees/index.blade.php:
And we will get the following queries performed:
While we have successfully addressed the (N+1) query problem, a new concern has emerged. Currently, we are fetching all sales data for each employee solely to display the latest sale amount, resulting in a considerable memory consumption of 15MB.
Considering a scenario where each employee possesses 1000 sales, we are burdening the memory with 10000 sales entries every time a page is loaded. This approach is inefficient and can lead to performance issues.
So, how to solve this issue ?
Solution:
We can use a subquery to retrieve the last sale amount without loading all the employee sales to the memory.
EmployeeController:
Let’s explain the previous query:
Employee::query()
: This initializes a query builder for theEmployee
model.->addSelect([...])
: This method allows you to add additional columns to the query result. In this case, a new column namedlast_sale_amount
is created.'last_sale_amount' => Sale::query()...
: Here, a subquery is used to fetch the last sale amount for each employee.
Sale::query()
: Initializes a query builder for theSale
model.->select('amount')
: Specifies that only theamount
column should be selected from theSale
table.->whereColumn('employee_id', 'employees.id')
: Specifies the relationship between theSale
andEmployee
tables. It compares theemployee_id
column in theSale
table with theid
column in theemployees
table.->latest()
: Orders the sales records in descending order based on their creation timestamp, ensuring that the latest sale comes first.->take(1)
: Limits the result to only the first record, which corresponds to the latest sale amount for each employee.
The raw SQL query generated :
select `employees`.*, (select `amount` from `sales` where `employee_id` = `employees`.`id` order by `created_at` desc limit 1) as `last_sale_amount` from `employees` limit 10 offset 0
views/employees/index.blade.php:
And we end up with the following queries performed:
We have effectively resolved the issue of excessive memory consumption by employing a targeted subquery to retrieve only the necessary data, specifically the last sale amount. This optimization not only eliminates the need to load all sales data into memory but also successfully addresses the (N+1) query problem, enhancing the efficiency of our data retrieval process.
Summary:
Subqueries play a crucial role in optimizing queries when eager loading is not sufficient for a specific task. They allow for more granular control over data retrieval, enabling the selection of specific data points or aggregations without the need to load unnecessary information. In scenarios where eager loading might not be the most efficient solution due to complex data relationships or specific data requirements, subqueries provide a powerful tool to tailor queries to meet the exact needs of the application.
By strategically incorporating subqueries, developers can fine-tune query performance, reduce memory consumption, and enhance the overall efficiency of data retrieval processes.