Web performance secrets from the BBC

Jamie Knight reveals the techniques the BBC uses to speed up its site and help users flow from one page to the next

Last year during a user testing session for the BBC News app, one of the users made a comment that has really stuck with me. They declared: “I like to flow”. I don’t think there’s a better summary of what performance means to our users. On a fast app or website, the user can flow around, interact and engage with the content.

Flowing experiences are good for site owners too. A fast-flowing experience helps users achieve their goals, and in turn we achieve our organisations’ goals. Amazon and others have demonstrated the strong link between performance and user activity: as the wait for pages goes down, the amount of time and money the user spends goes up.

In this tutorial I am going to explore some of the techniques we have used to keep the BBC site fast and our users flowing easily. I will first look at preserving flow, then take a closer look at caching.

Key goals

Preserving flow within a site will help meet the needs of different users. There are two goals to bear in mind:

  1. Minimise pauses: Delays reduce the user’s focus and introduce a switch in context
  2. Prioritise content: Load the content the user cares about most first

To achieve both goals we must consider website performance holistically. Performance is about more than just how long it takes for a page to load — to preserve our users’ flow we need to consider the overall experience and where the pauses lie; then think about which trade-offs we should make to give the best experience.

Sometimes it’s possible to design out common pauses once we have an understanding of how our users consume our content. For example, if we have a tabbed interface and we know one tab is popular, loading that tab’s content with the rest of the page may give a better overall experience than lazy loading it on request. The first page load will be slowed, but the ‘instant’ tab load will make the interaction feel much smoother.
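The trade-off above can be sketched in a few lines of JavaScript. This is an illustrative model, not BBC code: the popular tab ships with the initial page, while rarely used tabs are fetched only when opened.

```javascript
// Sketch: prioritise the most-visited tab by shipping its content
// with the initial page; lazy load the rest on demand.
// Tab names and markup here are illustrative.
const tabs = {
  popular: { html: '<p>Top stories</p>', loaded: true },  // rendered with the page
  other:   { html: null, loaded: false },                 // fetched on first click
};

function openTab(name, fetchHtml) {
  const tab = tabs[name];
  if (!tab.loaded) {
    // Only the rarely used tab pays a network pause.
    tab.html = fetchHtml(name);
    tab.loaded = true;
  }
  return tab.html;
}
```

Opening the popular tab is instant; only the other tabs ever trigger a request, and only once each.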

The network

No matter how we slice it, we need to get content across the network from our servers to the user. The network is one of the main sources of pauses, so it’s a good place to look in order to improve our users’ flow. There are many techniques for optimising how we work with the network — two common ones are to reduce page weight and to reduce request count. The less we use the network, the fewer pauses we expose our users to. However, these optimisations all occur after the first request for the HTML file is made.

The request for the HTML file sets the lower bound on page performance. Nothing else can happen until it has completed, so it’s a good place for optimisation. Caching is powerful because it works for the HTML request. Plus, it supercharges all our other network usage optimisations.

Caching

Caches are created when a small amount of something is stored closer to where it is needed, normally to prevent rework. For example, if I am eating Skittles, I tend to pour a few into my hand and then eat from there. In effect, I am creating a cache of Skittles in my hand as it’s quicker to eat them that way than going back to the packet.

This same pattern is used in technology. There are three caches we have to consider:

  • Server caches: Cached data on the server, such as the results of database queries
  • Network caches: Caches built into the network, sometimes by the site operator (known as a reverse proxy cache), but more often by ISPs or other networking providers
  • Browser cache: The browser stores files on the user’s hard drive for reuse by the user

Caching can make for a huge performance improvement; at the BBC I have seen caching increase performance more than 20 times in production code. It is beneficial for site operators too. With caching, more users can be supported by the same hardware, or less hardware can be used to support a given number of users. This reduces the cost in hardware per user and therefore reduces website operating costs.

The News homepage uses a 30 second max-age cache header to get content in front of users quickly without too much load

Design for the cache

For caching to be effective, we want to use cached data as much as possible. To extend the Skittles analogy, if I want a blue Skittle but I don’t have any blue Skittles in my hand (aka my cache), I will have to go back to the packet. The proportion of requests served from the cache is known as the ‘hit rate’: it’s a ‘hit’ when the item is in the cache and a ‘miss’ when it’s not. We want a high hit rate so the cache takes most of the load.

One of the simplest methods to increase hit rate is to reduce variation. Stretching my Skittles analogy a bit, imagine if all Skittles were red. That way, any Skittle in my hand would be a cache hit; I would never need to go back to the packet. Applying this to the web, if we can give the same page to as many users as possible, the cache becomes more effective as more requests will hit the cache.
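To make the hit/miss idea concrete, here is a minimal in-memory cache sketch with hit-rate counting. It is illustrative only — real server and network caches also handle expiry and eviction.

```javascript
// Minimal cache sketch: serve from the store on a hit, compute and
// remember the value on a miss, and track the hit rate.
class SimpleCache {
  constructor() {
    this.store = new Map();
    this.hits = 0;
    this.misses = 0;
  }

  get(key, compute) {
    if (this.store.has(key)) {
      this.hits++;                      // hit: reuse the stored value
      return this.store.get(key);
    }
    this.misses++;                      // miss: go "back to the packet"
    const value = compute(key);
    this.store.set(key, value);
    return value;
  }

  hitRate() {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Two requests for the same key give one miss and one hit — a 50 per cent hit rate; every further repeat pushes the rate higher.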

Caching HTML

So that’s the theory, let’s get practical. First, we need to tell the world that our pages are cacheable. Then we need to look at how to best use the cached pages to boost performance without losing functionality. For this deep dive, I will focus on network and browser caches.

Let’s start by looking at caching the request for the HTML. Caching of all file types is controlled using HTTP headers. The headers are metadata (data about data) sent from the server to the browser and visible to all the network hardware in between. To tell the world it has permission to cache our pages and to share that cache between users, we set the following header:

Cache-Control: public, max-age=30

Here, we have also set a time limit: the maximum amount of time the cache should reuse this page for, in seconds. For this example, I have set it to 30 seconds. Explore the Further Reading section for more resources on setting cache times.

By setting the page to public, the user’s browser (and any hardware along the way) will keep a copy. So the first page load will make a request, but all page loads after that will reuse the original response, until the time limit is reached.
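The freshness rule a cache applies is simple enough to sketch. The following is a simplified model (no revalidation, no other directives — see RFC 9111 for the full rules): a stored response is reusable while its age is below max-age.

```javascript
// Simplified freshness check for a cached response, based on the
// Cache-Control max-age directive. Real caches handle many more
// directives (no-store, s-maxage, stale-while-revalidate, ...).
function parseMaxAge(cacheControl) {
  const match = /max-age=(\d+)/.exec(cacheControl || '');
  return match ? Number(match[1]) : 0;
}

function isFresh(cacheControl, ageSeconds) {
  // Reusable while the stored copy is younger than max-age.
  return ageSeconds < parseMaxAge(cacheControl);
}
```

With the News homepage header above, a copy cached 10 seconds ago is served without touching the origin; at 31 seconds, the next request goes back to the server.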

The effect of network hardware along the way can be profound. Many large networks (such as ISPs) will have a cache shared between users. Mobile operators also use this technique heavily — for example, to cache and recompress images served over 3G. Site operators can also place an HTTP cache in front of their service. This is what we have done at the BBC.

Cache static assets for ages

A technique we use a lot at the BBC is to treat static assets (like images, CSS and scripts) differently to how we treat pages. Cached items are identified using the URL. There are ways to ‘revalidate’ cached content, but for the simplest case one URL means one cache entry, so caching HTML pages for too long can result in users missing content updates.

However, we can take advantage of this behaviour when it comes to static assets. At the BBC we send all static assets with a maximum age of 31,536,000 seconds set in the cache header. This ensures the assets are cached for 365 days. In effect, assets are only requested once. This is good for performance, but bad for flexibility as changes to that asset will take a long time to get to the user.

In order to work around this, every time we release a new version of a page, we change the URL where the assets are kept. This trick means that new changes are put in front of users immediately, but we still get the same performance benefits.
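One common way to implement this (the path scheme below is illustrative, not the BBC’s actual one) is to bake a release version into every asset URL at build time:

```javascript
// Sketch of cache-busting via versioned asset URLs. Because caches
// key on the URL, a new release produces new URLs, so the year-long
// browser cache can never serve a stale asset.
const RELEASE = '2.4.1'; // assumed build-time constant

function assetUrl(path) {
  return `/static/${RELEASE}/${path}`;
}
```

Each release changes RELEASE, every asset URL changes with it, and users fetch the new files immediately — while unchanged releases keep serving entirely from cache.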

BBC iPlayer stores static assets for a year — alterations to the URL ensure users see new versions promptly

Client-side variation

As we observed before, putting the right content in front of the user is key. An example from the BBC would be showing signed-in users their user name on each page. People don’t visit the BBC to view their user name, so it’s not priority content.

If implemented server-side, this variation would be terrible, as every signed-in user would receive a unique page, missing the cache every time. Instead, we give all signed-in users a single page, then swap the user name into place client-side. This is a good example of progressive enhancement being used to aid performance. It’s a subtle technique, but it gives a huge performance boost.
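A sketch of the swap, modelled on a string for clarity (in the browser you would update a DOM node after fetching the signed-in state; the placeholder text here is illustrative):

```javascript
// Progressive enhancement sketch: every user gets the same cached
// markup with a generic placeholder; client-side script personalises
// it after load. Anonymous users keep the generic label, so the
// page still works if the enhancement never runs.
function personalise(html, userName) {
  if (!userName) return html;
  return html.replace('Sign in', userName);
}
```

Because the HTML served is identical for everyone, every signed-in user is a cache hit — the personalisation costs one small client-side operation instead of a unique origin render per user.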

Final words

In this article we have looked into using caching in order to enhance website performance. The enhanced performance will in turn lower operating costs for our websites and preserve our users’ flow, leading to a great user experience.

Further Reading

There are plenty of articles out there to help you expand your knowledge on caching techniques. These are some of the best, to get you started:

A great guide for going deeper into the way cache headers work and how to apply them to your project.

A good example of using caching controls headers with PHP.

Rachel Andrew provides a great step-by-step guide to get you started with HTTP caching.

A detailed look at other forms of caching available to developers, targeting more modern use cases like offline apps.

A deep dive into how we have used varnish caching and CDN failover with client-side variation.

Jamie is a senior accessibility specialist at BBC Future Media, and former frontend lead for iPlayer Radio. He is slightly autistic and has a plushie companion called Lion


This article originally appeared in issue 279 of net magazine