PostgreSQL’s LATERAL JOIN
PostgreSQL 9.3 introduced new kind of SQL join type that can be useful for solving problems that needed some more advanced techniques (like builtin procedural language PL/pgSQL) in prior versions.
According to the official documentation:
The LATERAL key word can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROM item.)
…
When a FROM item contains LATERAL cross-references, evaluation proceeds as follows: for each row of the FROM item providing the cross-referenced column(s), or set of rows of multiple FROM items providing the columns, the LATERAL item is evaluated using that row or row set’s values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).
This is a pretty fancy definition, but you can think about this like: iterate over each of the results (record) and run subquery giving an access to that record.
Let’s see how to use it in practice.
Let’s say that we have a table with orders that looks like this:
and have data:
Now, we want to know what is the time of the first order created by each user and what is the ID and time of the next order. LATERAL
join is a great solution here.
Solution could be like this:
Let’s analyze each parts.
We want to present user_id, first_order_time, next_order_time, id
(SELECT user_id, first_order_time, next_order_time, id FROM
) taking only first created order by each unique user (SELECT user_id, min(created_at) AS first_order_time FROM orders GROUP BY user_id) o1
). To have an access to that data in the subquery we gave them an alias o1
.
After that we use LEFT JOIN LATERAL
which means we iterate over each of the element from o1
and we are executing subquery.
In the subquery we are selecting ID and time of the next order for each user. We can do this by using a condition created_at > o1.first_order_time
which will use first_order_time
from the first query (LATERAL
join gives us that opportunity).
As a result of running that query, we will see:
User with ID=2 has created only one query so he doesn’t have data for next_order_time
and id
. To fix it, we will change LEFT JOIN LATERAL
into INNER JOIN LATERAL
to consider only those rows which have corresponding rows in second table.
After that change our final SQL query looks like this:
Which gives us the results we want:
Want to know first about new articles from that blog?
Subscribe to my newsletter now! — http://eepurl.com/cVPm_v
If you like this article and consider it useful for you, please support it with 💚