How to limit results of a database query per group using SQLAlchemy and PostgreSQL
Have you ever faced with a task when you needed to limit results of a SQL query per group? For example, select the three most popular articles of every author?
If you have not, think how you would to do it. Will you select all authors to your application and iterate through them selecting their three articles or write a PL/pgSQL function?
There are two much more elegant ways to do this using just one request: window function and lateral join. We will focus on the later in this article, but you can read about their difference here.
LATERAL is a PostgreSQL (≥ 9.3) keyword. When it appears before a subquery in a
JOIN clause, it allows the subquery to reference columns gotten in the outer scope. If the keyword is not present, PostgreSQL evaluates the subquery independently, and therefore it cannot help accomplish our goal.
We will use two database tables with such columns:
These tables can be created using this Python code:
Then we need to construct an SQLAlchemy expression which matches this SQL code:
Let’s start from the subquery. It can be defined using the code below, and it is just like a common SQLAlchemy query, it just calls the
.lateral('top_3_articles') method in the end:
Defining the main query is not hard too:
We are ready to execute the query and print its results: