Drupal: The Case of the Ugly URLs

Chris Hill
Apr 17, 2023

tl;dr:

  1. Drupal doesn’t load its path alias system when executing update hooks, so don’t try to save entities in update hooks. If you load a node’s URL during an update hook, you’ll get the internal path (“/node/123”) instead of its alias (“/pictures/cute-cats”).
  2. Nor does it load aliases in post-update hooks (contrary to the docs), so avoid those for saving entities, too.
  3. You can, however, safely use Drush’s hook_deploy_NAME() for saving entities. All systems are available during this stage.

Long version:

Like many debugging adventures, this one started with an innocent question: Why do some search results link to the node’s internal path, not the node’s alias? My answer: I don’t know. Let’s reboot it (technically, re-index the content into our search index, so that the URLs use the path aliases). That worked: the search result URLs were aliased again. Case closed.

But a few weeks later, we noticed the same problem. Time to investigate.

This website uses Drupal 9, and it indexes nodes into a Solr server, using the Search API and Search API Solr contrib modules. One of the fields we index is the absolute URL to the node’s page. This is the field that sometimes contained the un-aliased path.

The first step to fixing a problem is understanding it. Reproducing the problem is a critical step to understanding. But I couldn’t reproduce it. Switching gears, I inspected the code that generated these URLs. All the code used Drupal’s APIs for building URLs, nothing special.

What now? Fortunately, we got a break when we realized that the timestamps on these un-aliased records corresponded to recent deployments (our Solr server adds a timestamp field to each node document to track when it was last indexed). In each of those deployments, we had re-saved groups of nodes using update hooks (see code example #1 below), which triggered their re-indexing into Solr. OK, let’s try that. In my test environment, I re-saved some nodes and expected to reproduce the problem, but the URLs were properly aliased. Phooey.

Remember that we used update hooks? There was this nagging warning from the documentation: “Be careful about… CRUD operations that you use in your update function.” Why be careful? I have used update hooks to re-save nodes for a long time and have never had a problem.

Up to this point, I hadn’t actually debugged this in an update hook. So, I added a new update hook and used it to re-save some nodes. Sure enough, the search indexer generated the un-aliased path. Success! But why does it do this?

Turns out that, starting in Drupal 8.6, the database update system stopped loading the path alias system (see d.o issue 3006086). In other words, Drupal doesn’t use path aliases when it’s executing update hooks.
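One way to see this behavior for yourself is a small diagnostic update hook. The sketch below is illustrative, not code from the site in question: the function names my_site_looks_unaliased() and my_site_update_9001() and the node ID 123 are assumptions, and the check assumes the site is served from the domain root with no language prefix in its URLs. It loads a node, asks Drupal for its URL, and reports whether the result looks like an internal path. It does not save anything.

```php
<?php

use Drupal\node\Entity\Node;

/**
 * Returns TRUE when a path looks like an internal node path ("/node/123").
 *
 * Hypothetical helper; assumes no base path or language prefix.
 */
function my_site_looks_unaliased(string $path): bool {
  return preg_match('#^/node/\d+$#', $path) === 1;
}

/**
 * Implements hook_update_N().
 *
 * Hypothetical diagnostic: reports the URL that Drupal generates for a node
 * while update hooks are running. Nothing is saved.
 */
function my_site_update_9001(&$sandbox) {
  // Node ID 123 is a placeholder; use a node that has a path alias.
  $node = Node::load(123);
  if ($node === NULL) {
    return 'Node 123 not found';
  }

  // Build the node's canonical URL the same way other code would.
  $path = $node->toUrl()->toString();
  $label = my_site_looks_unaliased($path) ? 'internal path' : 'aliased';

  return $path . ' (' . $label . ')';
}
```

If the behavior described above holds on your site, running the database updates should report the internal path for this node, while the same toUrl() call in a normal page request returns the alias.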

That nagging warning was justified. I shouldn’t have saved nodes in an update hook. But is there a better way? What about hook_post_update_NAME()? Per its docs, “Drupal is already fully repaired so you can use any API as you wish”. I moved my code to a post-update hook (see code example #2 below), but I got the same result. The node URLs did not have aliases. The alias system is still unavailable during post-update hooks. (I opened bug report issue 3332063 to clarify the documentation on that hook.)

Is there a safe phase for updating nodes in a deployment? Yes! Drush recently added hook_deploy_NAME(), which runs during drush deploy. It runs after regular and post-update hooks and after config has been imported, so at this point your deployment’s changes are all in effect and you can safely run code to update nodes. I moved my code to this phase (see code example #3 below), and everything worked great. Drupal generated the nodes’ URLs with path aliases included, and it indexed those into the search system.

I learned a lot during this investigation:

  • Investigate that “innocent” error. Pull on that thread! At the least, you’ll learn more about how your systems work.
  • Respect the warnings. I knew that we weren’t supposed to do CRUD operations in the update hook phase, but I never realized why until this investigation.
  • Always be ready with your debugger. My investigations would’ve been nearly impossible without a tool like Xdebug. Have a debugger and know how to use it.

For further reading about update and deploy hooks, check out https://www.hashbangcode.com/article/drupal-9-different-update-hooks-and-when-use-them.

Code examples

Code example #1

An “update hook”, from a module named my_site. This is in the file my_site.install.

Do not save content entities (e.g., nodes) during this stage.

use Drupal\node\Entity\Node;

/**
 * Implements hook_update_N().
 */
function my_site_update_9000(&$sandbox) {
  $node_ids = ['123', '124'];
  $nodes = Node::loadMultiple($node_ids);
  foreach ($nodes as $node) {
    $node->set('title', $node->label() . ' (code example #1)');
    $node->save();
  }

  return 'Re-saved nodes 123 and 124';
}

Code example #2

A “post-update hook”. Notice that its contents are the same, but it’s in a different file (my_site.post_update.php) and uses a different hook (hook_post_update_NAME).

Again, do not save content entities during this stage.

use Drupal\node\Entity\Node;

/**
 * Implements hook_post_update_NAME().
 */
function my_site_post_update_001__re_save_nodes(&$sandbox) {
  $node_ids = ['123', '124'];
  $nodes = Node::loadMultiple($node_ids);
  foreach ($nodes as $node) {
    $node->set('title', $node->label() . ' (code example #2)');
    $node->save();
  }

  return 'Re-saved nodes 123 and 124';
}

Code example #3

A “deploy hook”. This is executed by Drush’s “deploy” command, which is a helpful way to deploy updated code to a running site. This code is in my_site.deploy.php and implements hook_deploy_NAME.

In contrast to the previous two examples, you can safely save content entities during this stage.

use Drupal\node\Entity\Node;

/**
 * Implements hook_deploy_NAME().
 */
function my_site_deploy_001__re_save_nodes(&$sandbox) {
  $node_ids = ['123', '124'];
  $nodes = Node::loadMultiple($node_ids);
  foreach ($nodes as $node) {
    $node->set('title', $node->label() . ' (code example #3)');
    $node->save();
  }

  return 'Re-saved nodes 123 and 124';
}
