Drupal 8 and reverse proxies: The $base_url drama

tl;dr: Drupal 8 no longer lets you override $base_url the way earlier Drupal versions did, but in most cases you don’t need to.

A bit of theory

In a basic web stack setup, web clients interact with the web server directly. The server knows firsthand who its clients are and what they want. In PHP, this information is available through server variables (the $_SERVER superglobal): REMOTE_ADDR, HTTP_HOST, REQUEST_SCHEME, REQUEST_URI, etc.

In a scalable production environment you’ll have a CDN, load balancer, or caching reverse proxy in front of the web nodes:

Reverse proxy (image src)

The web server no longer interacts with clients directly. Instead, it sees the reverse proxy as its only client. The internal host name that points to the web server (e.g. web01.hosting.local) is not the same as the external one that points to the proxy (www.example.com). It’s also not uncommon for proxies to handle HTTPS termination (offloading) and forward traffic to the web nodes over plain HTTP. By the time the request arrives at the application layer, the information about the client and the request may be completely skewed.

Request routed via reverse proxy

httpS://www.example.com => http://web01.hosting.local:32831

Request details: as seen by reverse proxy

REMOTE_ADDR: 54.209.25.207
REQUEST_SCHEME: https
HTTP_HOST: www.example.com

Request details: as seen by web server / application

REMOTE_ADDR: 172.17.0.1
REQUEST_SCHEME: http
HTTP_HOST: web01.hosting.local:32831
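
To make the skew concrete, here is a minimal PHP sketch of what a naive application would do with the values in the second listing. The function naive_base_url() is a hypothetical helper, not part of Drupal; it simply rebuilds a base URL from the server variables as the web node receives them.

```php
<?php
// Hypothetical helper, not a Drupal API: builds a base URL from the
// server variables exactly as the web node receives them.
function naive_base_url(array $server): string {
  $scheme = $server['REQUEST_SCHEME'] ?? 'http';
  $host = $server['HTTP_HOST'] ?? 'localhost';
  return $scheme . '://' . $host;
}

// Values from the "as seen by web server / application" listing.
$seen_by_web_server = [
  'REMOTE_ADDR' => '172.17.0.1',
  'REQUEST_SCHEME' => 'http',
  'HTTP_HOST' => 'web01.hosting.local:32831',
];

// Prints the internal address rather than https://www.example.com.
echo naive_base_url($seen_by_web_server), "\n";
```

Any absolute link generated this way would point at the internal node, which external clients cannot reach.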

So how would your application know the real client IP, and how can it construct a correct external absolute URL from this information? Let’s start with the latter.

Figuring out the external base URL

The Drupal 7 way: Override the $base_url variable

One obvious way is to hardcode the $base_url. It’s not the most elegant approach, but it is valid and reliable.

Drupal 7 had (and still has) an option for this in settings.php:

/**
 * Base URL (optional).
 *
 * If Drupal is generating incorrect URLs on your site, which could
 * be in HTML headers (links to CSS and JS files) or visible links on pages
 * (such as in menus), uncomment the Base URL statement below (remove the
 * leading hash sign) and fill in the absolute URL to your Drupal installation.
 *
 * You might also want to force users to use a given domain.
 * See the .htaccess file for more information.
 *
 * Examples:
 *   $base_url = 'http://www.example.com';
 *   $base_url = 'http://www.example.com:8888';
 *   $base_url = 'http://www.example.com/drupal';
 *   $base_url = 'https://www.example.com:8888/drupal';
 *
 * It is not allowed to have a trailing slash; Drupal will add it
 * for you.
 */
$base_url = 'http://www.example.com'; // NO trailing slash!

The bad news: this option has been removed in Drupal 8.

The good news: hardcoding the $base_url is no longer necessary in Drupal 8 in the most common reverse proxy scenarios.

The Drupal 8 way: Use “X-Forwarded-*” request headers

Drupal 8 added support for a group of HTTP headers dedicated to handling the reverse proxy / CDN use cases. Proxies pass the original request information via these headers, so it can be read and interpreted at the application level.

Here they are:

The X-Forwarded-Host (XFH) header is a de-facto standard header for identifying the original host requested by the client in the Host HTTP request header.
The X-Forwarded-Proto (XFP) header is a de-facto standard header for identifying the protocol (HTTP or HTTPS) that a client used to connect to your proxy or load balancer. Your server access logs contain the protocol used between the server and the load balancer, but not the protocol used between the client and the load balancer. To determine the protocol used between the client and the load balancer, the X-Forwarded-Proto request header can be used.

Using our previous example, here’s what an equivalent curl command would look like using these headers:

curl \
-H "X-Forwarded-Host: www.example.com" \
-H "X-Forwarded-Proto: https" \
http://web01.hosting.local:32831
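
A rough sketch of how an application could use these headers is shown here. It is not Drupal’s actual implementation (Drupal 8 relies on the Symfony HttpFoundation Request object for this); external_base_url() and its parameters are hypothetical names, and the sketch assumes the forwarded headers should be honored only when the request arrives directly from an address on a trusted proxy list.

```php
<?php
// Hypothetical helper, not Drupal's implementation. Headers are read from
// $_SERVER-style keys (HTTP_X_FORWARDED_HOST, HTTP_X_FORWARDED_PROTO).
function external_base_url(array $server, array $trusted_proxies): string {
  $scheme = $server['REQUEST_SCHEME'] ?? 'http';
  $host = $server['HTTP_HOST'] ?? 'localhost';

  // Only believe forwarded headers when the direct peer is a known proxy.
  $from_proxy = in_array($server['REMOTE_ADDR'] ?? '', $trusted_proxies, true);
  if ($from_proxy) {
    $proto = strtolower($server['HTTP_X_FORWARDED_PROTO'] ?? '');
    if ($proto === 'http' || $proto === 'https') {
      $scheme = $proto;
    }
    if (!empty($server['HTTP_X_FORWARDED_HOST'])) {
      $host = $server['HTTP_X_FORWARDED_HOST'];
    }
  }
  return $scheme . '://' . $host;
}

// The request from the curl example, as it would arrive from the proxy.
$server = [
  'REMOTE_ADDR' => '172.17.0.1',
  'REQUEST_SCHEME' => 'http',
  'HTTP_HOST' => 'web01.hosting.local:32831',
  'HTTP_X_FORWARDED_HOST' => 'www.example.com',
  'HTTP_X_FORWARDED_PROTO' => 'https',
];

echo external_base_url($server, ['172.17.0.1']), "\n"; // https://www.example.com
echo external_base_url($server, []), "\n";             // http://web01.hosting.local:32831
```

The second call shows why the trust list matters: without it, the headers are ignored and the application falls back to what the web server sees.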

Figuring out the real client IP

While we do not need to know the real client IP to figure out the base URL, this is still part of the same use case and is solved in a similar way — via an extra request header.

Both Drupal 7 and Drupal 8 use the same standard approach when working behind a reverse proxy and/or a CDN — the X-Forwarded-For header.

The X-Forwarded-For (XFF) header is a de-facto standard header for identifying the originating IP address of a client connecting to a web server through an HTTP proxy or a load balancer. When traffic is intercepted between clients and servers, server access logs contain the IP address of the proxy or load balancer only. To see the original IP address of the client, the X-Forwarded-For request header is used.

It is worth noting that HTTP headers are vulnerable to spoofing: any client can send any HTTP request header with any value.

curl \
-H "X-Forwarded-For: <fake ip here>" \
-H "My-Own-Header: <nonsense>" \
http://web01.hosting.local:32831

To prevent IP spoofing, either access to the web servers should be restricted to a pool of known reverse proxy IPs (at the infrastructure level), or Drupal needs to be told which remote hosts are reverse proxies that it can trust.

This can be configured in settings.php using the following two variables.

Drupal 7

$conf['reverse_proxy'] = TRUE;
$conf['reverse_proxy_addresses'] = array('a.b.c.d', ...);

Drupal 8

$settings['reverse_proxy'] = TRUE;
$settings['reverse_proxy_addresses'] = array('a.b.c.d', ...);

When requests are routed through multiple proxy layers before reaching the web servers, each proxy appends the IP address it received the request from to the X-Forwarded-For header value.

Example request path:

Client => CDN => Load Balancer => Caching Proxy => Web Servers

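The resulting header value, as received by the web servers, will have the following structure (the caching proxy’s own IP is not part of the header; the web server sees it as REMOTE_ADDR):

X-Forwarded-For: <client-ip>, <cdn-ip>, <lb-ip>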
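
Because a client can send its own X-Forwarded-For entries, a chain like this is usually read from right to left, skipping addresses that belong to trusted proxies; the first untrusted address is taken as the client. Below is a minimal sketch of that idea. resolve_client_ip() is a hypothetical name, the IPs are placeholders from documentation ranges, and real implementations also deal with CIDR ranges and IPv6, which this sketch does not.

```php
<?php
// Hypothetical helper for illustration; not a Drupal or Symfony API.
function resolve_client_ip(string $remote_addr, string $xff, array $trusted): string {
  // Without a trusted direct peer, the header cannot be believed at all.
  if (!in_array($remote_addr, $trusted, true)) {
    return $remote_addr;
  }
  // Split "a, b, c" into a clean list and append the direct peer.
  $chain = array_filter(array_map('trim', explode(',', $xff)), 'strlen');
  $chain[] = $remote_addr;

  // Walk from the web server outwards; stop at the first unknown address.
  foreach (array_reverse($chain) as $ip) {
    if (!in_array($ip, $trusted, true)) {
      return $ip;
    }
  }
  // Every hop was a trusted proxy: fall back to the left-most entry.
  return reset($chain);
}

// Placeholder addresses for the CDN, load balancer and caching proxy.
$trusted = ['198.51.100.10', '198.51.100.20', '198.51.100.30'];

// The client spoofed "10.9.9.9"; its real IP (203.0.113.7) was appended
// by the CDN, so the spoofed entry sits to the left and is never reached.
$xff = '10.9.9.9, 203.0.113.7, 198.51.100.10, 198.51.100.20';
echo resolve_client_ip('198.51.100.30', $xff, $trusted), "\n"; // 203.0.113.7
```

This is also why every proxy layer has to be listed as trusted: an unlisted hop would be mistaken for the client.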

Adoption by CDN providers

While the X-Forwarded-* family of headers is widely used, these headers are non-standard and not part of the HTTP protocol spec. As such, support for them varies among reverse proxies and CDNs.

X-Forwarded-* header support by CDN providers

X-Forwarded-For received the most adoption, followed by X-Forwarded-Proto. X-Forwarded-Host finishes last and here’s why.

Most CDNs forward the original Host header to the origin server, which eliminates the need for the extra X-Forwarded-Host header.

An equivalent curl command that demonstrates this:

curl -H "Host: www.example.com" http://web01.hosting.local:32831

An edge case for X-Forwarded-Host

I recently ran into an edge case where passing the Host header did not work and the X-Forwarded-Host header was the only viable solution for a Drupal 8 site.

The client wanted to have a single external domain pointed at the CDN (Akamai). The CDN then routed requests between two origins based on the country suffix in the URL. Origins were different site instances, but shared the same load balancer.

www.example.com/us (CDN) => origin-us.example.com (LB) => webs-us
www.example.com/ca (CDN) => origin-ca.example.com (LB) => webs-ca

With both sites sharing the same LB, we could not have the same Host value routed differently at the LB level (the LB did not support routing based on the request URI).

The CDN had to be configured to NOT forward the original Host header and instead send the original Host value via X-Forwarded-Host. Drupal 8 picked up the X-Forwarded-Host header and generated absolute URLs correctly.

We also had to create symlinks inside of the web site document roots to map the suffixes.

docroot
\_ ...
\_ us -> .
\_ ...

This made Drupal think it was installed in a subdirectory, so it included the country suffix in the absolute URLs it generated:

www.example.com/us/about-us — routes correctly everywhere.
www.example.com/about-us — won’t be routed correctly at the CDN level.
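
As far as I understand it, the symlink works because the base path is derived from the path of the front controller script as the web server reports it (e.g. /us/index.php). The sketch below only illustrates that idea; base_path_from_script() and absolute_url() are hypothetical helpers, and the real logic in Drupal 8 and Symfony is more involved.

```php
<?php
// Hypothetical helpers for illustration; not Drupal or Symfony APIs.
function base_path_from_script(string $script_name): string {
  // "/us/index.php" -> "/us", "/index.php" -> "".
  $dir = str_replace('\\', '/', dirname($script_name));
  return rtrim($dir, '/');
}

function absolute_url(string $base_url, string $script_name, string $path): string {
  return $base_url . base_path_from_script($script_name) . '/' . ltrim($path, '/');
}

// Request served through the "us" symlink: the suffix ends up in the URL.
echo absolute_url('https://www.example.com', '/us/index.php', 'about-us'), "\n";
// Request served from the docroot directly: no suffix.
echo absolute_url('https://www.example.com', '/index.php', 'about-us'), "\n";
```

The first call yields https://www.example.com/us/about-us and the second https://www.example.com/about-us, matching the two cases listed.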

In Drupal 7, the same requirement can be handled by setting

$base_url = "https://www.example.com/us";

in settings.php and adding a rewrite rule in .htaccess:

RewriteRule ^us/(.*)$ $1

Drupal 8 still provides no good alternative for less common installation cases (see comments in here), where hardcoding $base_url used to be a simple and reliable solution. You may have to fiddle with rewrite rules or come up with other workarounds.

