PostgreSQL’s LATERAL JOIN
PostgreSQL 9.3 introduced new kind of SQL join type that can be useful for solving problems that needed some more advanced techniques (like builtin procedural language PL/pgSQL) in prior versions.
According to the official documentation:
LATERALkey word can precede a sub-
SELECT FROMitem. This allows the sub-
SELECTto refer to columns of
FROMitems that appear before it in the
LATERAL, each sub-
SELECTis evaluated independently and so cannot cross-reference any other
LATERALcross-references, evaluation proceeds as follows: for each row of the
FROMitem providing the cross-referenced column(s), or set of rows of multiple
FROMitems providing the columns, the
LATERALitem is evaluated using that row or row set’s values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).
This is a pretty fancy definition, but you can think about this like: iterate over each of the results (record) and run subquery giving an access to that record.
Let’s see how to use it in practice.
Let’s say that we have a table with orders that looks like this:
and have data:
Now, we want to know what is the time of the first order created by each user and what is the ID and time of the next order.
LATERAL join is a great solution here.
Solution could be like this:
Let’s analyze each parts.
We want to present
user_id, first_order_time, next_order_time, id (
SELECT user_id, first_order_time, next_order_time, id FROM) taking only first created order by each unique user (
SELECT user_id, min(created_at) AS first_order_time FROM orders GROUP BY user_id) o1). To have an access to that data in the subquery we gave them an alias
After that we use
LEFT JOIN LATERAL which means we iterate over each of the element from
o1 and we are executing subquery.
In the subquery we are selecting ID and time of the next order for each user. We can do this by using a condition
created_at > o1.first_order_time which will use
first_order_time from the first query (
LATERAL join gives us that opportunity).
As a result of running that query, we will see:
User with ID=2 has created only one query so he doesn’t have data for
id. To fix it, we will change
LEFT JOIN LATERAL into
INNER JOIN LATERAL to consider only those rows which have corresponding rows in second table.
After that change our final SQL query looks like this:
Which gives us the results we want:
Want to know first about new articles from that blog?
Subscribe to my newsletter now! — http://eepurl.com/cVPm_v
If you like this article and consider it useful for you, please support it with 💚