CakePHP Window functions and Common Table Expressions

CakePHP 4.1.0 is going to be released soon. Thanks to othercorey and ndm2 4.1.0 adds powerful new features to the ORM — Window functions and Common Table Expressions.

Window functions

Window functions enable you to generate aggregations for ‘windows’ of your data. You define what the ‘window’ is by telling the database how to partition your data into frames around the current row, and perform aggregations across the rows in each frame.

Some example problems you could solve with window functions are:

  • Getting the percentage of total views for each article an author has written.
  • Getting a running total for timeseries, or financial data.
  • Computing values based on data in adjacent rows. For example computing
    time between train stations, or free time in a person’s calendar.

A minimal example of window functions using the schema from the CMS tutorial would be:

$query = $this->Articles->find();
$query->select([
'Articles.title',
'Articles.user_id',
'Articles.views',
'running_total_views' => $query
->func()
->sum('Articles.views')
->partition('Articles.user_id')
->order('Articles.views ASC')
->rows(0, null),
])
->orderDesc('Articles.user_id')
->orderDesc('Articles.views')
->disableHydration();
// Each row will have a `running_total_views` key.
debug($query->toArray());
// The output of the above would be:
[
[
'title' => 'First Article',
'user_id' => 1,
'views' => 1,
'running_total_views' => 1,
],
[
'title' => 'Latest Article',
'user_id' => 1,
'views' => 4,
'running_total_views' => 5,
],
]

You could get the average views using a window function like this:

$query = $this->Articles->find();
$query->select([
'Articles.title',
'Articles.user_id',
'Articles.views',
// get average views
'average_views' => $query
->func()
->avg('views')
->partition('Articles.user_id'),
])
->orderDesc('Articles.user_id')
->orderDesc('Articles.views')
->disableHydration();
// The output of the above would be:
[
[
'title' => 'First Article',
'user_id' => 1,
'views' => 1,
'average_views' => 2.5,
],
[
'title' => 'Latest Article',
'user_id' => 1,
'views' => 4,
'running_total_views' => 2.5,
],
]

Window functions can be a bit opaque to work with at times, but we can use a GROUP_CONCAT to dump out the data each window is looking at:

$query = $this->Articles->find();
$query->select([
'Articles.title',
'Articles.user_id',
'Articles.views',
// Look at windows
'debug' => $query
->func()
->aggregate(
'group_concat',
['Articles.views' => 'literal', ','],
[],
'string'
)
->partition('Articles.user_id')
->rows(0, null),
])
// The output of the above would be:
[
[
'title' => 'First Article',
'user_id' => 1,
'views' => 1,
'debug' => '1',
],
[
'title' => 'Latest Article',
'user_id' => 1,
'views' => 4,
'debug' => '1,4',
],
]

Common Table Expressions

Common table expressions which are often abbreviated to ‘CTE’ enable you to create intermediate result sets and then use those tables to build queries on. They serve a similar purpose to database views or derived tables. However, they differ from derived tables and views a few ways. First you don’t have to maintain schema for a common table expression. Second, you can reference the results of a common table expression multiple times without incurring performance penalties, unlike derived table joins.

As an example, we want to fetch a list of tags, and the number of articles each one has.

$query = $this->Articles->find();
$query->with(function (CommonTableExpression $cte, Query $query) {
$q = $query->cleanCopy();
$q->repository($this->Articles->Tags->junction());
$q->select([
'article_id' => 'article_id',
'tag_count' => $q->func()->count('*'),
])
->from('articles_tags')
->group(['article_id']);
return $cte
->name('tag_counts')
->query($q);
});
$query->select([
'title',
'tag_count' => 'tag_counts.tag_count'
])
->join([
'tag_counts' => [
'table' => 'tag_counts',
'type' => 'LEFT',
'conditions' => 'tag_counts.article_id = Articles.id'
],
])
->disableHydration();
// The above would output
[
[
'title' => 'First Article',
'tag_count' => 4,
]
]

I have found common table expressions most useful in reporting queries when I need to break up a big complex query into easier to understand blocks, or merge the results of multiple aggregate queries together. The ORM will support common table expressions in SELECT, INSERT, UPDATE and DELETE operations. Window functions, and common table expressions work with Postgres, SQLServer, SQLite and MySQL 8.0+. I’m really happy to see the CakePHP ORM continuing to evolve and support advanced SQL features. While my examples are simple they illustrate how straightforward the new APIs introduced in 4.1.0 are to use, and I hope they help you build even better applications.

Written by

Principal Developer @ Sentry, Creator of Stickler-CI and CakePHP core team member.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store