Drupal caching debugging part 1: Caching layers
When it comes to caching, sooner or later (like some variant of Godwin’s law) someone will refer to Phil Karlton’s adage:
There are only two hard things in Computer Science: cache invalidation and naming things.
Since Drupal 8 there has been an elaborate caching system, so I would say that with respect to caching in Drupal there are two hard questions:
- How do I make sure my data is cached?
- How do I make sure my data is not cached?
In most cases Drupal will handle the first issue for us. For anonymous users most of the pages will be cached, as we will see in more detail in the section about caching layers. The second issue is more or less caused by the first: if in our setup most data comes from cache for anonymous users, how do we exclude the parts we do not want to be cached?
But most of all, if cache invalidation is such a hard thing, we must have some way to determine if we have a caching problem. In Drupal, solving the problem is in most cases not such a hard thing. Recognizing a problem is a lot harder.
Debugging caching problems is a very broad subject and won’t fit in a single blog post. So in this one I will concentrate on the different layers of caching.
The different layers of caching
Visiting a website is a process that involves a large number of applications, each of which can (and usually does) cache data. The important question is whether a layer causes caching bugs and, if so, whether we can somehow influence the behaviour of that layer. Let’s take a look at each layer, starting at the layer where it all starts (or, for the less customer-oriented developer, ends): the browser.
Browser cache
With all the caching layers that will follow, browser cache almost seems something from the days of 2400 baud modems. But even now not everyone has broadband internet, and even if people have it at home, they won’t have it while on the move. And more importantly, the default mode of operation of browsers is to cache the data they receive, so we must deal with this caching in some way.
To control this cache layer we can use the Cache-Control HTTP header. The most useful is the ‘max-age’ directive that defines the maximum amount of time in seconds that responses are allowed to be used again. A value of max-age=0 indicates that the cache entry requires re-validation.
To prevent caching completely the ‘no-cache’ directive seems a likely candidate, but ‘no-cache’ does not mean ‘do not cache at all’. It means the browser must revalidate with the server before using any cached response it may have, on every request. So if the server (or one of the other layers answering on its behalf) indicates the cached data is not stale, the browser will still use it. The directive that actually forbids storing a response is ‘no-store’.
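The difference between these directives can be sketched in a few lines of plain PHP. This is a deliberately simplified model, not how any real browser is implemented: the function names parse_cache_control and cache_decision are made up for this example, and it ignores everything except max-age, no-cache and no-store (the directive that forbids storing a response at all).

```php
<?php

declare(strict_types=1);

/**
 * Parses a Cache-Control header value into a directive => value map.
 *
 * Directives without a value (such as no-cache) map to TRUE.
 * Hypothetical helper, written for this example only.
 */
function parse_cache_control(string $header): array {
  $directives = [];
  foreach (explode(',', $header) as $part) {
    $part = trim($part);
    if ($part === '') {
      continue;
    }
    $pieces = explode('=', $part, 2);
    $name = strtolower(trim($pieces[0]));
    $directives[$name] = isset($pieces[1]) ? trim($pieces[1], " \"") : TRUE;
  }
  return $directives;
}

/**
 * Simplified decision for a stored response of a given age.
 *
 * Returns 'not-stored', 'revalidate' or 'reuse'.
 */
function cache_decision(string $header, int $age_in_seconds): string {
  $directives = parse_cache_control($header);
  if (isset($directives['no-store'])) {
    // The response may not be kept at all.
    return 'not-stored';
  }
  if (isset($directives['no-cache'])) {
    // The response is stored, but must be checked with the server first.
    return 'revalidate';
  }
  $max_age = (int) ($directives['max-age'] ?? 0);
  return $age_in_seconds < $max_age ? 'reuse' : 'revalidate';
}

echo cache_decision('public, max-age=3600', 10), PHP_EOL;
echo cache_decision('no-cache, max-age=3600', 10), PHP_EOL;
```

In this model the first call yields 'reuse' and the second 'revalidate': with no-cache the response is still stored, it just cannot be used without asking the server first.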
Let go or dive in?
Apart from setting the max-age parameter, for example in settings.php,
$config['system.performance']['cache']['page']['max_age'] = 3600; // Time in seconds (here: one hour).
there is not much you can do here and, most likely, the problem you are trying to solve does not originate in this layer but in one of the following.
Reverse proxy
If we disregard any caching by smart routers or proxies in (corporate) networks, the reverse proxy is the next caching layer we are most likely to encounter. Varnish is one of the most widely used reverse proxies. In the test setup discussed in a later part of this series, we will use DDEV with a Varnish container to test, but the same principles apply to any other reverse proxy.
For Drupal the Purge module (https://www.drupal.org/project/purge) is a generic and (as the project page states) modular external cache invalidation framework suitable for any reverse proxy. For Varnish it should be used in combination with the purge_purger_http module; see Use Drupal 8 Cache Tags with Varnish and Purge for a good introduction by Jeff Geerling.
Let go or dive in?
Unless you are setting up a new server, or are really suspicious about your sysop, it is not very likely that the problem you are trying to solve stems from the Varnish configuration.
Some knowledge of the Varnish tools, like varnishlog and varnishadm, can help to find the deeper problem, but I would not recommend letting yourself be drawn into the world of VCLs and Varnish debugging, unless of course you are looking for a new, time-consuming late-night hobby.
DON'T BUG ME: HOW TO DEBUG AND FIX VARNISH gives a very basic starter on debugging Varnish, and recommends coffee while doing so. Achieving a high hitrate is a more elaborate introduction, but does not, however, recommend coffee.
Opcode cache
Since we are discussing Drupal and Drupal is a PHP/Symfony application, using an opcode cache will improve your site performance, but it will not be the killer app that solves all your problems. On the other hand, the way opcode caches work and the fact that they have become so stable make it very unlikely that they are the source of the problem.
Let go or dive in?
Like with Varnish, most of the time you can trust your sysop and take these for granted. If you are looking for a new hobby though, Varnish is much more fun.
Application cache
The caching in the application, in our case Drupal, is at the base of good performance. Of course all the layers mentioned above are important, and I really do not recommend running any serious site without them, but the application caching layer is the most relevant one for us, Drupal developers. And it is the layer in which we can make a real difference.
But it also means we have to accept that the fault is on our side, and not with the sysop or hosting company (or the customer who has employed them). I personally find life much easier when I start by accepting this, and only point to the other (p)layers when I have solid proof the fault lies there.
Let go or dive in?
Dive! Dive! Dive! It's not only fun, it will also make your site better.
Drupal Internal Page Cache
The Internal Page Cache module implements caching for anonymous users. The code for this module can be found at core/modules/page_cache and since it is very well documented (the comments almost exceed the actual code) it is a good starting point if you want to learn more about this cache layer.
If you look at it you will find the code is very compact. The main reason for this is of course that it uses an implementation of Drupal\Core\Cache\CacheBackendInterface to do the actual work. The logic about when to store or retrieve the cached data, however, is implemented in the Drupal\page_cache\StackMiddleware\PageCache class.
The module defines two services, but don't be tempted to use these directly. In almost all cases using cache tags and cache contexts, as the Internal Dynamic Page Cache does, is a far cleaner and better solution.
Drupal Internal Dynamic Page Cache
The Internal Dynamic Page Cache, found in core/modules/dynamic_page_cache, is a smarter caching implementation that also caches data for authenticated users or, as the module page on Drupal.org states:
It caches pages minus the personalized parts, and is therefore useful for all users (both anonymous & authenticated).
Unlike the Internal Page Cache, which is implemented as middleware, this cache is implemented as an event subscriber (it implements the EventSubscriberInterface) because "(...)many cache contexts can only be evaluated after routing".
Although the Internal Page Cache is an excellent way to improve performance for anonymous users in a zero-configuration way (okay, there is one setting on /admin/config/development/performance), the real power comes with the implementation of cache tags and cache contexts.
And, as always, with real power come real problems.
Drupal Render cache
Since Drupal 8, Drupal has had a smart rendering system that caches (parts of) render arrays. It ensures that not all HTML is rendered on every request and, more importantly, that it is rendered anew when the content involved changes.
The render cache is implemented in:
core/lib/Drupal/Core/Render/RenderCache.php
This file contains large stretches of comments, especially in the ‘set’ method, which are absolutely worth a read.
In the debugging part of this series we will go into more detail on the render cache. For now it is good to know that cache tags and cache contexts can be added to each render array, and that this provides fine-grained caching.
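As a minimal sketch of what such metadata looks like: a render array is a plain PHP array, and its cacheability lives under the '#cache' key. The concrete values below (the my_module and greeting key parts, the node:5 tag, the one-hour max-age) are illustrative assumptions, not taken from any real site.

```php
<?php

// A render array with cacheability metadata (illustrative values only).
$build = [
  '#markup' => 'Hello, visitor!',
  '#cache' => [
    // Identifies this piece of output in the render cache.
    // 'my_module' and 'greeting' are hypothetical names.
    'keys' => ['my_module', 'greeting'],
    // Invalidate this item whenever node 5 changes.
    'tags' => ['node:5'],
    // Store a separate variation per combination of user roles.
    'contexts' => ['user.roles'],
    // Treat the item as fresh for at most one hour.
    'max-age' => 3600,
  ],
];
```

Tags answer the question "what does this output depend on?", contexts answer "what does it vary by?", and max-age answers "how long can it be trusted?".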
Drupal cache backends
In Drupal core a number of cache backends are implemented:
- \Drupal\Core\Cache\ApcuBackend, extended by \Drupal\Core\Cache\Apcu4Backend
- \Drupal\Core\Cache\MemoryBackend, extended by \Drupal\Core\Cache\MemoryCounterBackend and \Drupal\Core\Cache\MemoryCache\MemoryCache
- \Drupal\Core\Cache\PhpBackend
- \Drupal\Core\Cache\DatabaseBackend
The last one, DatabaseBackend, is Drupal's default cache implementation. In special cases you can use one of the other implementations, or use a module, like the Redis module, to get even more implementations.
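Switching backends is done in settings.php. The fragment below is a sketch under assumptions: it presumes the contrib Redis module is installed and that its connection settings are configured elsewhere, and the choice of which bin to override is purely illustrative.

```php
<?php

// settings.php (sketch, not a complete configuration).

// Assumption: the Redis module is installed and its connection settings
// are configured elsewhere in this file. This makes Redis the default
// backend for all cache bins.
$settings['cache']['default'] = 'cache.backend.redis';

// Individual bins can still be pointed at another backend. Here the
// 'render' bin stays in the database (an arbitrary example).
$settings['cache']['bins']['render'] = 'cache.backend.database';
```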
Besides the implementations mentioned above there are a number of special implementations:
- \Drupal\Core\Update\UpdateBackend
This is a special implementation for use during Drupal database updates.
- \Drupal\Core\Cache\BackendChain
This implementation makes it possible to combine two different backends, for example a really fast one with limited resources (like MemoryBackend) and a slower one (like DatabaseBackend), which is consulted when the cached item is not found in the first one.
- \Drupal\Core\Cache\ChainedFastBackend
The ChainedFastBackend is a specialised variant of the same idea, especially suited for websites running on multiple nodes. While BackendChain works on the assumption that both backends are consistent, ChainedFastBackend does not assume the fast backend is consistent (that is, in sync across all nodes). To check whether it can serve an item from the fast cache, it keeps track of the last time items were written to the slower, consistent backend and only serves an item from the fast backend if it was created after this timestamp; older items are fetched from the consistent backend again.
- \Drupal\Core\Cache\NullBackend
The most common use case for this backend is during development. Because clearing the cache after each change is time-consuming, most Drupal developers enable this backend in the local settings:
$settings['container_yamls'][] = '../settings/services.local.yml';
$settings['cache']['bins']['render'] = 'cache.backend.null';
$settings['cache']['bins']['page'] = 'cache.backend.null';
$settings['cache']['bins']['dynamic_page_cache'] = 'cache.backend.null';
The services.local.yml is necessary to define the cache.backend.null service, like this:
services:
  cache.backend.null:
    class: Drupal\Core\Cache\NullBackendFactory
This file can also be used to enable a number of other useful debugging features, like Twig debugging and sending headers to debug the cacheability of pages.
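To come back to the ChainedFastBackend for a moment: its timestamp check is easier to follow in code. The following is a toy model in plain PHP, not Drupal's actual class. The names SharedStore and ToyChainedFastCache are invented for this sketch, time is passed in explicitly to keep it deterministic, and the assumption is that each web node has its own fast store while the consistent store and the last-write timestamp are shared.

```php
<?php

declare(strict_types=1);

/**
 * Stands in for the slow, consistent backend shared by all nodes.
 */
final class SharedStore {
  /** @var array<string, string> */
  public array $items = [];
  public float $lastWrite = 0.0;
}

/**
 * Toy model of one node's view: a local fast store plus the shared store.
 */
final class ToyChainedFastCache {
  /** @var array<string, array{data: string, created: float}> */
  private array $fast = [];
  private SharedStore $shared;

  public function __construct(SharedStore $shared) {
    $this->shared = $shared;
  }

  public function set(string $cid, string $data, float $now): void {
    // Writes go to the consistent store and move the shared timestamp,
    // which implicitly outdates older fast copies on every node.
    $this->shared->items[$cid] = $data;
    $this->shared->lastWrite = $now;
  }

  public function get(string $cid, float $now): ?string {
    $item = $this->fast[$cid] ?? NULL;
    // A fast copy is only trusted if it was created at or after the last
    // write to the consistent store.
    if ($item !== NULL && $item['created'] >= $this->shared->lastWrite) {
      return $item['data'];
    }
    if (!isset($this->shared->items[$cid])) {
      return NULL;
    }
    // Refresh the local fast copy from the consistent store.
    $data = $this->shared->items[$cid];
    $this->fast[$cid] = ['data' => $data, 'created' => $now];
    return $data;
  }
}

$shared = new SharedStore();
$nodeA = new ToyChainedFastCache($shared);
$nodeB = new ToyChainedFastCache($shared);

$nodeA->set('greeting', 'hello', 100.0);
$first = $nodeB->get('greeting', 101.0);   // 'hello', now also in B's fast store.
$nodeA->set('greeting', 'goodbye', 102.0);
$second = $nodeB->get('greeting', 103.0);  // 'goodbye': B's fast copy predates the write.
$third = $nodeB->get('greeting', 104.0);   // 'goodbye', served from B's fast store.
```

Node B never receives an explicit invalidation; comparing the creation time of its fast copy with the shared timestamp is enough to notice that node A wrote something newer.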
Pitfall number one
The NullBackend implementation makes development a lot smoother, but, as you probably can see, it is also a very large pitfall. Code that seems to be working perfectly on the local development environment, or maybe even on the test server, can produce unexpected and unwanted behaviour that will be hard to debug, especially if this must be done after some time.
Even the most conscientious customer or tester will not always be aware of caching problems or how to test them. And in most cases this will be an even larger problem if someone other than the developer who has written the code has to debug it later on. Maybe this developer will profit from the next part of this series: Debugging caching.
Louis Nagtegaal is an all-round digital specialist at LimoenGroen with a strong focus on back-end development. LimoenGroen (Dutch for ‘lime green’) is a team of passionate people working with Drupal for a broad range of clients. We provide user-centric, scalable websites and digital solutions that will add lasting value to organizations. With a strong focus on web accessibility and open source, we care to share. We are based in Amsterdam, The Netherlands. We have worked for many top Dutch and international brands including IUCN, Fox and Van Gogh Museum. Would you like to know more about meta tags, accessibility or web development? Contact Louis at louis@limoengroen.nl.
Photo by Florian Krumm on Unsplash