Caching Large Navigation Menus In Drupal

Isovera Staff, Isovera
Published Jan 10, 2019 · 9 min read

I have been working with Marco Villegas on a Project for Pegasystems. We are importing documentation into Drupal from an external system. The documentation is organized in Drupal using the core Book module, and some of the books have thousands of pages.

The problem is that processing the navigation menu for such a large book takes a lot of time, on the order of a minute! This post describes how we leveraged the advanced caching system in Drupal 8 so that pages load in a reasonable amount of time without having to cache gigabytes of data.

[Screenshot: Pega Community documentation page. The book navigation is nested several levels deep.]

The import project is not yet complete, so I cannot give a live link, but here is a screenshot of the navigation. Notice that most of the navigation is collapsed, but enough of it is open to show the path to the current page.

Significant Numbers

  • 4472 pages in the book
  • 2.7 MB rendered (twice) for each page
  • 40–50 sec initial load (nothing cached)
  • 6–9 sec load after caching (before the work described here)

The navigation is rendered twice on every page: once for desktop and once for mobile. We should do something about that, but not today.

After we did this work, the cached page loads in 2–3 sec. Normally, I would not brag about times like that, but it is a lot better than it was. (Also, the 6–9 seconds relied on some earlier caching work.)

I think that Emily Dickinson would understand how I feel about these load times:

The Heart asks Pleasure — first —
And then — Excuse from Pain —

Strategy

Cache the navigation once per book

If we cache the navigation separately for each page of the book, that means roughly 10 GB of cache (4472 pages at 2.7 MB each comes to about 12 GB). It also means that we spend a lot more time generating the navigation menu than if we generate it once per book.
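The estimate can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it uses the page count and rendered size from the list above, counts one copy of the navigation per cache entry, and uses decimal gigabytes. The variable names are made up for the illustration.

```javascript
// Figures from the "Significant Numbers" list.
const pagesInBook = 4472;
const navSizeMB = 2.7; // rendered navigation, one copy

// Assumption: caching per page stores one copy of the navigation per page,
// while caching per book stores a single copy.
const perPageCacheMB = pagesInBook * navSizeMB;
const perBookCacheMB = navSizeMB;

// Decimal gigabytes (1 GB = 1000 MB) are close enough for an estimate.
const perPageCacheGB = perPageCacheMB / 1000;

console.log(`Cached per page: about ${Math.round(perPageCacheGB)} GB`);
console.log(`Cached per book: about ${perBookCacheMB} MB`);
```

Either way you count it, the per-page cache is thousands of times larger than the per-book cache, which is the point of the strategy.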

Set active trail with JavaScript

The problem with caching the navigation once per book is that we need to customize it for each page, opening up the path to the current page. So let’s customize it per page with JavaScript, on the client side.

Implementation

Here is a simplified version of the Twig Template that creates the navigation block:

{% if tree %}
  <nav class="c-book-nav" role="navigation" aria-labelledby="book-label-{{ book_id }}">
    <a href="{{ book_url }}">
      {{ top_book_title }}
    </a>
    {{ tree }}
  </nav>
{% endif %}

The important part is the Twig variable at the end: {{ tree }} is the part we have to compute and cache. Also notice that we already have some CSS classes that we can target with jQuery.

Hook Node View

From the screenshot above, you might think that the navigation is in a block, placed in the sidebar region. In fact, the caching would be a little simpler if that were the case. The way the site is built, it is actually in the main page array.

Here is the code that adds a render array for the book navigation. After a few checks to make sure that it should be added, it creates a simple render array with a custom #theme and a single parameter. That is, it depends on the current book ('#book_id' => $book_id) but not on the current page.

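use Drupal\Core\Cache\Cache;
use Drupal\Core\Entity\Display\EntityViewDisplayInterface;
use Drupal\node\Entity\Node;
use Drupal\node\NodeInterface;

/**
 * Implements hook_ENTITY_TYPE_view() for node entities.
 */
function pdn_book_node_view(array &$build, NodeInterface $node, EntityViewDisplayInterface $display, $view_mode) {
  // Only add the navigation to the full view of a saved book page.
  if ($view_mode != 'full') {
    return;
  }
  if (empty($node->book['bid']) || !empty($node->in_preview)) {
    return;
  }
  $book_id = $node->book['bid'];
  $book_node = Node::load($book_id);
  // Node::load() returns NULL if the book node no longer exists.
  if (!$book_node || !$book_node->access('view')) {
    return;
  }
  // Cache the navigation block once for the entire book.
  // We will set the active trail client-side.
  $build['book_nav'] = [
    '#theme' => 'book_nav',
    '#book_id' => $book_id,
    '#weight' => 100,
    '#cache' => [
      'keys' => ['pdn_book_nav', $book_id],
      'contexts' => ['languages'],
      'tags' => ["node:$book_id"],
      'max-age' => Cache::PERMANENT,
    ],
  ];
}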

I will explain the #cache parameters below. (If you want, you can skip to the section “Tell Drupal how to cache the navigation”.)

Hook Theme

This is pretty standard, but for completeness here is the definition of the custom theme function. Again, there is only one parameter, the book ID. The Twig template is the one I showed above, book_nav.html.twig.

/**
 * Implements hook_theme().
 */
function pdn_book_theme($existing, $type, $theme, $path) {
  return [
    'book_nav' => [
      'variables' => [
        'book_id' => 0,
      ],
    ],
  ];
}

Preprocess Function

This function takes the single book_id parameter provided to the theme function and adds the other variables used in the Twig template, including $variables['tree']. This function was already in the code before we started working on it. It is based on some code already in the core Book module.

function template_preprocess_book_nav(&$variables) {
  /** @var \Drupal\book\BookManagerInterface $book_manager */
  $book_manager = \Drupal::service('book.manager');
  // Get the nested array (tree) of menu links.
  $book_tree = $book_manager
    ->bookTreeAllData($variables['book_id']);
  // Generate a render array from the tree of links, skipping the top-level
  // book link itself and starting with the pages below it.
  $tree_output = $book_manager
    ->bookTreeOutput(array_shift($book_tree)['below']);
  $variables['tree'] = $tree_output;
  $variables['book_url'] = \Drupal::url(
    'entity.node.canonical',
    ['node' => $variables['book_id']]
  );
  $book_node = Node::load($variables['book_id']);
  $variables['top_book_title'] = $book_node->getTitle();
  // Flag books whose top-level page has no body content.
  $variables['top_book_empty']
    = !$book_node->hasField('field_body')
    || $book_node->get('field_body')->isEmpty();
}

JavaScript

Here is the JavaScript that opens up the path to the current page. Since jQuery is very good at traversing the DOM, this ends up being a lot simpler than the PHP code we used previously.

The second half of this snippet was already there. We just added the part that finds the <nav class="c-book-nav">, looks inside it for a link to the current page, and then adds class="active" to that link and its parents and class="c-book-nav--list-expanded" to the parent <li> elements.

(function ($, Drupal) {
  Drupal.behaviors.bookNavExpand = {
    attach: function attach(context) {
      // Use window.location rather than context.location: on AJAX attaches,
      // context is a DOM element and has no location property.
      var bookNav = $('.c-book-nav', context);
      $('a[href="' + window.location.pathname + '"]', bookNav)
        .addClass('active')
        .parentsUntil(bookNav, '.c-book-nav--list-expandable')
        .addClass('c-book-nav--list-expanded')
        .children('a')
        .addClass('active');
      // Show the lists that were just marked as expanded.
      $('.c-book-nav--list-expanded > .c-book-nav--list', context)
        .once('bookNavExpandInit')
        .css('display', 'block');
      // Toggle a branch when its arrow is clicked.
      $('.c-book-nav--expand-arrow', context)
        .once('bookNavExpandClick')
        .on('click', function() {
          $(this).parent().toggleClass('c-book-nav--list-expanded');
          $(this).siblings('.c-book-nav--list').slideToggle();
        });
    }
  };
})(jQuery, Drupal);

There is room for improvement here. It is a little inefficient to traverse the DOM twice (once to add class="c-book-nav--list-expanded" and a second time to set display: block on those elements). We decided to KISS for now: just add our 7 lines of JavaScript and not touch what was already there.
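The idea behind the active trail does not depend on jQuery or even on the DOM. As a rough illustration, here is the same search on a plain nested data structure, so it can run outside the browser. The tree, its paths, and the function name findActiveTrail are all made up for this sketch; the real code walks up from the matching link with parentsUntil(), while this version walks down from the top.

```javascript
// Hypothetical stand-in for the cached navigation: nested links.
const bookTree = [
  { href: '/book/intro', children: [] },
  {
    href: '/book/install',
    children: [
      {
        href: '/book/install/linux',
        children: [{ href: '/book/install/linux/debian', children: [] }],
      },
      { href: '/book/install/mac', children: [] },
    ],
  },
];

// Return the hrefs from the top level down to `pathname`,
// or an empty array if the page is not in the tree.
function findActiveTrail(items, pathname) {
  for (const item of items) {
    if (item.href === pathname) {
      return [item.href];
    }
    const below = findActiveTrail(item.children, pathname);
    if (below.length > 0) {
      return [item.href, ...below];
    }
  }
  return [];
}

const trail = findActiveTrail(bookTree, '/book/install/linux/debian');
console.log(trail.join(' > '));
```

Every entry in the returned trail corresponds to a link that would get class="active", and every entry except the last corresponds to a list item that would be expanded.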

Tell Drupal how to cache the navigation

Here again is the render element we added to the page.

$build['book_nav'] = [
  '#theme' => 'book_nav',
  '#book_id' => $book_id,
  '#weight' => 100,
  '#cache' => [
    'keys' => ['pdn_book_nav', $book_id],
    'contexts' => ['languages'],
    'tags' => ["node:$book_id"],
    'max-age' => Cache::PERMANENT,
  ],
];

Now let’s look at the four entries in the #cache sub-array.

Cache Keys

'keys' => ['pdn_book_nav', $book_id],

We provide two cache keys:

  • A unique string to identify “our” cache entries.
  • The book ID.

This is how we cache once per book.

Without cache keys, the other cache metadata would still bubble up to the page render array, but our render array would not be cached on its own, and caching it on its own is exactly what we want. If the book navigation were in a block, then the block would be cached and we would not have to supply cache keys.

Cache Contexts

'contexts' => ['languages'],

If the book is viewed in another language, then the link text will change, so we need to tell Drupal to store a separate copy for each language. Maybe the link URLs will also change, depending on how we manage languages.

In fact, this site is not (yet) multilingual, so we are trying to be a little proactive.

A drawback to the once-per-book strategy is that the navigation menu will not update if any individual page is updated, say with a new title. This is not a problem for books imported from an external system, but the site has other books as well. We may decide to add the 'route.book_navigation' cache context, if this does not affect performance badly. See Cache contexts in Drupal’s Cache API documentation.

Cache Tags

'tags' => ["node:$book_id"],

This tells Drupal that when node/$book_id is updated, it should invalidate this cache entry. Tags are independent of how many variants are cached: the keys and contexts decide that, and the tags decide when the cached copies go stale. For example, we might want to cache once per book but invalidate it if any page in the book is updated. Then we would still include the book ID in the cache keys, and we would add a cache tag for each node in the book.

On my local copy of the site, the cache tags are stored in the database, where I can examine them. (See below.) On production, they might be handled by memcache. At the page level, cache tags are sent in HTTP headers, so that Varnish or a CDN can invalidate pages based on cache tags.
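One way to picture tag-based invalidation is the toy model below. It is not Drupal's code. It assumes that each tag has an invalidation counter and that each cache entry remembers the sum of its tags' counters at write time, so an entry is stale as soon as any of its tags has been invalidated since. The class and method names are invented for the sketch.

```javascript
// Toy model of tag invalidation (an assumption, not Drupal's implementation).
class TaggedCache {
  constructor() {
    this.entries = new Map();       // cid -> { data, tags, checksum }
    this.invalidations = new Map(); // tag -> number of invalidations so far
  }

  // Sum of the invalidation counters for a list of tags.
  checksum(tags) {
    return tags.reduce((sum, tag) => sum + (this.invalidations.get(tag) || 0), 0);
  }

  set(cid, data, tags) {
    this.entries.set(cid, { data, tags, checksum: this.checksum(tags) });
  }

  // A hit requires that no tag was invalidated after the entry was written.
  get(cid) {
    const entry = this.entries.get(cid);
    if (!entry || entry.checksum !== this.checksum(entry.tags)) {
      return undefined;
    }
    return entry.data;
  }

  invalidateTags(tags) {
    for (const tag of tags) {
      this.invalidations.set(tag, (this.invalidations.get(tag) || 0) + 1);
    }
  }
}

const cache = new TaggedCache();
cache.set('pdn_book_nav:704369', '<nav>...</nav>', ['node:704369']);
const beforeSave = cache.get('pdn_book_nav:704369');
cache.invalidateTags(['node:704369']); // for example, the book node is saved
const afterSave = cache.get('pdn_book_nav:704369');
console.log(beforeSave !== undefined, afterSave === undefined);
```

In this model nothing is deleted when a tag is invalidated. The entry simply stops being returned, and the next request rebuilds and overwrites it.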

Cache Max Age

'max-age' => Cache::PERMANENT,

This tells Drupal to keep the cached version until we say to clear it.

Peek At the Database

On my local copy of the site, the render cache is stored in the database, so we can see the results of these settings with a few queries. On production, this cache is handled by memcache.

The cache_render Table

Here is the relevant database table:

mysql> DESCRIBE cache_render;
+------------+---------------+------+-----+---------+-------+
| Field      | Type          | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| cid        | varchar(255)  | NO   | PRI |         |       |
| data       | longblob      | YES  |     | NULL    |       |
| expire     | int(11)       | NO   | MUL | 0       |       |
| created    | decimal(14,3) | NO   | MUL | 0.000   |       |
| serialized | smallint(6)   | NO   |     | 0       |       |
| tags       | longtext      | YES  |     | NULL    |       |
| checksum   | varchar(255)  | NO   |     | NULL    |       |
+------------+---------------+------+-----+---------+-------+
7 rows in set (0.01 sec)

Query

After clearing caches and viewing one page, there is just one entry matching the unique string we supplied as a cache key. I have added some whitespace to make this easier to read. I did not include the data column, since that would have been overwhelming. I skipped serialized: it is a boolean that says whether the data is a simple string or a serialized PHP variable.

mysql> SELECT cid, expire, created, tags, checksum
       FROM cache_render
       WHERE cid LIKE 'pdn_book%'
       LIMIT 0,1\G
********************** 1. row **********************
     cid: pdn_book_nav:
          704369:
          [languages]=en:
          [theme]=pegawww_theme:
          [user.permissions]=4f64d6e20026c96e963d91bab0192f9824e8cb2e9352eb4c1ca18d78478abfdb
  expire: -1
 created: 1543638198.782
    tags: config:system.book.704369 node:704369 rendered
checksum: 12
1 row in set (0.00 sec)

Cache ID (cid)

This identifies the cached item.

  • We specified pdn_book_nav in the cache keys.
  • The book ID (704369) also comes from cache keys.
  • languages comes from cache contexts.
  • theme and permissions are default contexts: see below.
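To make the structure of the cid concrete, here is a small model that reproduces the string from the query result. It is a sketch of the pattern visible in that row, not Drupal's implementation: the function name buildCacheId is hypothetical, and sorting the contexts by name is an assumption that happens to match the order shown.

```javascript
// Toy model: cache keys first, then one "[context]=value" part per context.
// Sorting the context names is an assumption that matches the row above.
function buildCacheId(keys, contextValues) {
  const contextParts = Object.keys(contextValues)
    .sort()
    .map((name) => `[${name}]=${contextValues[name]}`);
  return [...keys, ...contextParts].join(':');
}

// The permissions hash copied from the query result.
const permissionsHash =
  '4f64d6e20026c96e963d91bab0192f9824e8cb2e9352eb4c1ca18d78478abfdb';

const cid = buildCacheId(['pdn_book_nav', 704369], {
  'user.permissions': permissionsHash,
  theme: 'pegawww_theme',
  languages: 'en',
});
console.log(cid);
```

Changing any key or any context value produces a different cid, which is how one book, one language, one theme, and one set of permissions map to exactly one cached copy.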

Cache Max Age (expire and created)

Since we specified 'max-age' => Cache::PERMANENT, in the cache settings, the expire column is set to -1. If we had specified 86400 (one day) then the expire value would have been 86400 more than the created value. (I should check this.)
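The relationship described here, which is still unverified, can be written down as a tiny function. This is a sketch of the expectation, not of Drupal's code: the function name is made up, and the created value is rounded down to whole seconds because expire is an integer column.

```javascript
// -1 is the value stored in the expire column for Cache::PERMANENT.
const CACHE_PERMANENT = -1;

// Assumption (not yet checked against Drupal): a finite max-age is stored
// as an absolute timestamp, created + max-age.
function expectedExpire(created, maxAge) {
  return maxAge === CACHE_PERMANENT ? CACHE_PERMANENT : created + maxAge;
}

const created = 1543638198; // whole seconds from the row in the query result
console.log(expectedExpire(created, CACHE_PERMANENT));
console.log(expectedExpire(created, 86400));
```

A quick way to check the assumption is to set 'max-age' => 86400 in the render array, clear caches, load a page, and compare the expire and created columns.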

Cache Tags

Again the cache tags describe when this entry should be purged. I am happy to see node:704369 (the book ID), but I am not sure where the other tags are generated.

Permissions Hash

I am punting on some of the cache tags, but I promised to explain where two parts of the cache ID come from. See sites/default/services.yml:

parameters:
  renderer.config:
    # Renderer required cache contexts:
    #
    # The Renderer will automatically associate these cache
    # contexts with every render array, hence varying every
    # render array by these cache contexts.
    #
    # @default ['languages:language_interface', 'theme', 'user.permissions']
    required_cache_contexts:
      - 'languages:language_interface'
      - 'theme'
      - 'user.permissions'

This shows that we did not have to specify 'languages' in the cache contexts: it is already added by default. It also explains why the theme and the user permissions appear in the cache ID.

Conclusion

Our main goal was to improve page-load times: they started out terrible and now they are merely bad, maybe even fair. As a bonus, I learned a little about how the cache system works in Drupal 8. Comparing the settings we provided in the render array to what gets stored in the database helped to de-mystify the system for me.

Try it yourself! In order to experiment with the cache settings, you can skip the theme function and the Twig template; just build your render array directly. Try setting a different max age, or adding cache contexts, and see how it affects what is saved in the database.

References

I already mentioned one reference: the Cache contexts page, which is part of the Cache API guide on drupal.org.

The other reference I found most helpful for explaining the importance of cache keys is

in the API documentation.

These two pages in the Render API guide are also useful:

This article was originally published by Benji Fisher on our Isovera blog.
