PostgreSQL’s LATERAL JOIN

PostgreSQL 9.3 introduced new kind of SQL join type that can be useful for solving problems that needed some more advanced techniques (like builtin procedural language PL/pgSQL) in prior versions.

According to the official documentation:

The LATERAL key word can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROM item.)

When a FROM item contains LATERAL cross-references, evaluation proceeds as follows: for each row of the FROM item providing the cross-referenced column(s), or set of rows of multiple FROM items providing the columns, the LATERAL item is evaluated using that row or row set’s values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).

This is a pretty fancy definition, but you can think about this like: iterate over each of the results (record) and run subquery giving an access to that record.

Let’s see how to use it in practice.

Definition of orders table

Let’s say that we have a table with orders that looks like this:

and have data:

Data in orders table

Now, we want to know what is the time of the first order created by each user and what is the ID and time of the next order. LATERAL join is a great solution here.

Solution could be like this:

SQL query with LEFT JOIN LATERAL

Let’s analyze each parts.

We want to present user_id, first_order_time, next_order_time, id (SELECT user_id, first_order_time, next_order_time, id FROM) taking only first created order by each unique user (SELECT user_id, min(created_at) AS first_order_time FROM orders GROUP BY user_id) o1). To have an access to that data in the subquery we gave them an alias o1 . 
After that we use LEFT JOIN LATERAL which means we iterate over each of the element from o1 and we are executing subquery.

In the subquery we are selecting ID and time of the next order for each user. We can do this by using a condition created_at > o1.first_order_time which will use first_order_time from the first query (LATERAL join gives us that opportunity).

As a result of running that query, we will see:

Results of running query with LEFT JOIN LATERAL

User with ID=2 has created only one query so he doesn’t have data for next_order_time and id. To fix it, we will change LEFT JOIN LATERAL into INNER JOIN LATERAL to consider only those rows which have corresponding rows in second table.

After that change our final SQL query looks like this:

SQL query with INNER JOIN LATERAL

Which gives us the results we want:

Results of running query with INNER JOIN LATERAL

Want to know first about new articles from that blog?

Subscribe to my newsletter now! — http://eepurl.com/cVPm_v


If you like this article and consider it useful for you, please support it with 💚