Drupal and HTTP/2: Time To Experiment
Written by Geoff Appleby
HTTP/2 is pretty cool, with the potential to reduce resource usage and increase performance of the web. It’s supported by Firefox, Chrome, and Edge, as well as Safari in OS X 10.11+. The web servers Apache and Nginx support it. CDNs like Cloudflare and Akamai support it. The availability of free TLS certificates from Let’s Encrypt removes cost as a barrier to wider implementation. HTTP/2 also introduces some new features for applications to make use of, and I wanted to explore the possibilities for Drupal 8 to take advantage of the changes and improvements.
HTTP/2 sends data over a single connection with a stream for each resource, avoiding the overhead of creating multiple connections and the limits on concurrent connections. The spec recommends a minimum limit of 100 concurrent streams, which is a significant increase over the limit of 6–8 connections per host that browsers currently enforce. To prevent a large number of concurrent streams from causing delays as they fight for bandwidth, stream priorities let the browser tell the server which items it would like sent first, and a stream that is held up does not delay the others. If every required asset is specified individually rather than concatenated, perceived performance should improve, since only the uncached data actually needed for the page will be downloaded.
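To make the last point concrete, here is a rough back-of-the-envelope sketch in Python. The asset names and sizes are made up for illustration, and the model ignores compression and per-request overhead. It simply compares how many bytes a browser with an empty cache downloads across two page views when each page ships one concatenated bundle versus individual files.

```python
def bytes_concatenated(pages, sizes):
    """Bytes downloaded when every distinct asset set is served as one bundle."""
    seen_bundles = set()
    total = 0
    for assets in pages:
        bundle = frozenset(assets)
        # A bundle is only reusable if a later page needs exactly the same set.
        if bundle not in seen_bundles:
            seen_bundles.add(bundle)
            total += sum(sizes[name] for name in bundle)
    return total


def bytes_individual(pages, sizes):
    """Bytes downloaded when every asset is requested and cached separately."""
    cached = set()
    total = 0
    for assets in pages:
        for name in assets:
            if name not in cached:
                cached.add(name)
                total += sizes[name]
    return total


# Hypothetical sizes in kilobytes.
sizes = {"base.css": 10, "a.css": 5, "b.css": 7}
pages = [["base.css", "a.css"], ["base.css", "b.css"]]

print(bytes_concatenated(pages, sizes))  # 32: base.css is downloaded twice
print(bytes_individual(pages, sizes))    # 22: base.css is reused from cache
```

The gap comes entirely from the shared file being repeated inside each bundle, which is the cost that separate streams are meant to remove.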
Current recommendations are typically to continue concatenating assets to avoid penalizing older clients until HTTP/2 adoption is great enough. However, this limits the ability to take the best advantage of the HTTP/2 capabilities of newer browsers now, and will still affect older browsers once the switch is made. I wanted to explore the possibility of serving pages to users based on the optimal experience for their browser’s capability through a module for Drupal 8.
The Drupal Module
Drupal 8’s improved caching system uses Cache Contexts to ensure that items are cached according to what makes each component unique, allowing the dynamic page cache to only process and render the parts of the pages that are unique to each request. Some example contexts are the language of the component or whether the component varies for authenticated versus anonymous users. In this case, it’s necessary to create a cache context for whether the request was made with HTTP/2 based on the data provided from the web server.
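The idea behind such a context can be sketched outside of Drupal in a few lines of Python. This is not the module's code or Drupal's API. The function names, the "h1"/"h2" values, and the key format are my own illustrative choices, and I'm assuming the web server reports the protocol as a string such as "HTTP/1.1" or "HTTP/2.0".

```python
def protocol_context(server_protocol: str) -> str:
    """Map the server-reported protocol string to a cache context value."""
    # Assumption: HTTP/2 requests are reported as "HTTP/2" or "HTTP/2.0".
    _, _, version = server_protocol.strip().upper().partition("/")
    return "h2" if version.split(".")[0] == "2" else "h1"


def cache_id(base_key: str, contexts: dict) -> str:
    """Build a cache ID by appending each context value to a base key."""
    # Sorting keeps the ID stable regardless of the order contexts were added.
    parts = [base_key]
    parts += [f"[{name}]={contexts[name]}" for name in sorted(contexts)]
    return ":".join(parts)


contexts = {
    "language": "en",
    "protocol_version": protocol_context("HTTP/2.0"),
}
print(cache_id("page_attachments", contexts))
# page_attachments:[language]=en:[protocol_version]=h2
```

Two requests that differ only in protocol end up with different cache IDs, so the variant built for one is never served to the other.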
Is this really a good idea?
Drupal optimizes CSS a little bit by stripping comments and removing extra whitespace. Since all optimizations are turned off for HTTP/2 requests, those requests will transfer slightly more data. Ideally, HTTP/2 requests would still have the same minification applied to individual assets, so that there is minimal difference in transfer size.
The dynamic page cache doesn’t cache the page attachments markup, so it will generate the one component of the page that does vary between protocols on every request without needing to be aware of the request’s protocol itself. There is a problem with Drupal’s anonymous page caching, though, as it operates solely on the URL of the request, completely ignoring the protocol, and serves whichever version of each URL was first requested and cached. As visitors navigate through the site they could alternate between pages that are optimized for each of the two protocols, increasing the amount of data that everyone has to download. The page cache doesn’t currently allow hooking into the criteria for caching page variants, so a customized page cache service would be required to overcome this limitation.
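The page cache problem is easy to reproduce with a toy model. The Python sketch below is not Drupal's page cache. The class, the render function, and the "h1"/"h2" labels are hypothetical and only show what happens when the cache key is the URL alone versus the URL plus the protocol.

```python
class ToyPageCache:
    """A minimal page cache keyed on the URL, optionally also on the protocol."""

    def __init__(self, vary_on_protocol: bool):
        self.vary_on_protocol = vary_on_protocol
        self.store = {}

    def get_or_render(self, url, protocol, render):
        key = (url, protocol) if self.vary_on_protocol else (url,)
        if key not in self.store:
            # Whoever requests the page first decides what gets cached.
            self.store[key] = render(url, protocol)
        return self.store[key]


def render(url, protocol):
    """Stand-in for page rendering: picks the asset strategy per protocol."""
    assets = "individual" if protocol == "h2" else "concatenated"
    return f"{url} with {assets} assets"


url_only = ToyPageCache(vary_on_protocol=False)
url_only.get_or_render("/about", "h1", render)
print(url_only.get_or_render("/about", "h2", render))
# /about with concatenated assets  (the HTTP/1.1 variant, served to HTTP/2)

per_protocol = ToyPageCache(vary_on_protocol=True)
per_protocol.get_or_render("/about", "h1", render)
print(per_protocol.get_or_render("/about", "h2", render))
# /about with individual assets
```

A customized page cache service would need to behave like the second instance, at the cost of storing two variants of every page.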
External caching layers like Varnish or Cloudflare will have the same issue as the internal page cache, since they also operate primarily based on URL. The standard way to communicate the criteria by which a page may differ beyond the URL is for the server to send a Vary header, which lists request headers such as ‘Accept-Encoding’ or ‘Cookie’. Using the Vary header, the browser and other caches can determine whether a request can be served from existing data or requires a new request to the server. Since the protocol version is a property of the connection rather than one of the request headers, though, there isn’t a header that can be added to Vary to differentiate HTTP/1.1 and HTTP/2 requests. Additionally, while browsers would connect to the caching layer with different protocols, the caching layer will only connect with the origin server over a single protocol. A custom Varnish server may be configurable to pass on the protocol from the browser and cache the corresponding variants, but more generic solutions like Cloudflare won’t have that capability.
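A simplified version of the check a cache performs with Vary shows why the protocol slips through. This Python sketch is my own approximation, not any particular cache's implementation. Real caches also normalize header values, which is skipped here. The function only ever looks at request headers, so two requests with identical headers are treated as the same variant no matter which protocol carried them.

```python
def vary_matches(vary: str, cached_request: dict, new_request: dict) -> bool:
    """Return True if a cached response may be reused for the new request."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    if "*" in names:
        # "Vary: *" means the response can never be matched by headers alone.
        return False
    # Header names are case-insensitive, so compare on lowercased keys.
    cached = {key.lower(): value for key, value in cached_request.items()}
    new = {key.lower(): value for key, value in new_request.items()}
    return all(cached.get(name) == new.get(name) for name in names)


# The same browser headers, once sent over HTTP/1.1 and once over HTTP/2.
# The protocol is not a header, so it cannot be listed in Vary.
over_http1 = {"Accept-Encoding": "gzip", "Cookie": ""}
over_http2 = {"Accept-Encoding": "gzip", "Cookie": ""}

print(vary_matches("Accept-Encoding, Cookie", over_http1, over_http2))  # True
```

The only way around this in the sketch would be to copy the protocol into a made-up request header before the cache sees it, which is roughly what a custom Varnish configuration would have to do.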
Ultimately this narrows any benefits of providing HTTP/2 specific content to sites that are unable to make use of Drupal’s internal page cache and don’t use any external cache, which could have significant performance implications that would dwarf the benefits of any content optimizations. The current recommendation of continuing to concatenate assets until HTTP/2 adoption is widespread still stands, unless you’re interested in experimenting and have significant control over all of the layers between your web server and the user’s browser.