Website Performance: Understanding the Basics

The critical rendering path is crucial for website performance. This article shines a light on browser rendering and provides tools and strategies to optimize for it.

Originally published at fabianstiehle.com

Certain resources on websites are render blocking: the browser can't paint the website without downloading and evaluating them. On many websites, the browser spends precious first-page-load time downloading and evaluating resources not actually needed for rendering the initial page. For a fast website we need to optimize for fast rendering. This article will give you an overview of the tools and methods required to optimize the rendering path, without getting caught up in implementation details.

But before we dig deeper into optimizing the render path, it is crucial to understand how a browser actually renders a website.

Browser Rendering on an abstract level

Before anything else, the browser parses the HTML and creates the Document Object Model, usually abbreviated as DOM. The DOM is a tree structure and essentially an abstract representation of all valid HTML nodes, visible or not.

The DOM doesn't contain any information regarding the visual representation of its elements; this is handled by another vital tree structure: the CSSOM. As most of you might guess, CSSOM is an abbreviation for CSS Object Model, and it contains all valid CSS styling information from both internal and external stylesheets.

From the DOM and CSSOM the render tree (or frame tree) is constructed. It depicts what actually needs to be rendered: only visible HTML nodes, no head or hidden elements. At this point the CSSOM is matched against the DOM, and the styles are attached to the matching elements in the render tree.

When the render tree is constructed, the browser can proceed by calculating the coordinates of each element on the screen. This process is called layout, as the browser is laying out the elements on the screen.

Now the webpage can finally be painted on the screen, and this is quite literally a painting process, often hardware-accelerated by the GPU.

As we can see, rendering a webpage is a very sequential process. It harbors a lot of potential bottlenecks, one of them very obvious: the browser needs both the CSSOM and the DOM before rendering can occur.

Almost all front-end performance challenges stem from this process, though to fully understand the broad picture we also need to look at JavaScript.

Effects of JavaScript on the rendering of a webpage

JavaScript is loaded and evaluated synchronously by default. When the browser encounters JavaScript, it halts page parsing, then downloads and executes the file. This leads us back to our initial problem: while JavaScript halts page parsing, it effectively keeps the DOM from being completed.

JavaScript as such is not a render-blocking resource but a parser-blocking one, which in turn leads to render blocking.

JavaScript has another peculiarity: it is itself dependent on the CSSOM. JavaScript won't execute before the CSSOM is fully constructed, because the browser must assume that the script may access or modify CSS properties, and therefore delays execution.
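To make this dependency chain concrete, here is a minimal sketch (file names are placeholders). The stylesheet blocks rendering; the script blocks parsing and additionally waits for the stylesheet before it executes:

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Render blocking: the CSSOM cannot be completed without this file -->
  <link rel="stylesheet" href="style.css">
</head>
<body>
  <p>This content is parsed immediately.</p>
  <!-- Parser blocking: parsing halts here. Execution additionally waits
       for style.css, since the script may read or modify CSS properties. -->
  <script src="app.js"></script>
  <p>This content is not parsed until app.js has executed.</p>
</body>
</html>
```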

Resource prefetching

Modern browsers employ prefetching: as soon as the parser is blocked, the browser looks ahead for other resources it might have to load further down the page. No DOM manipulation or parsing happens during this time; the browser only starts loading these resources.

Browser rendering in the real world

We can conclude that HTML, CSS and most JavaScript resources (we will discuss exceptions later) must be loaded and evaluated before the page can render. These resources are therefore render blocking.

Let’s look at a waterfall diagram:

Network waterfall of wordpress.com

This is wordpress.com as seen in Chrome's network tab. Now let's verify what we explained above. The timeline expands to the right until the first page render. I also cut out unnecessary information.

Look at the entry "wordpress.com", which I marked green. It's the actual HTML file and is downloaded after ~300ms. But as you can see (also marked green), the DOMContentLoaded event at the bottom of the diagram fires after ~700ms; still lightning fast, but much later than the HTML is loaded. In this timeframe the parsing happens, and the parser initiates downloads for the files it found while parsing the HTML, in this case mainly JavaScript files and an external font file. Interesting to know: the browser usually defers downloading of fonts (here .woff files) until shortly before the first render.

The font file, the external CSS file and the JavaScript files all finish downloading before rendering. Notice the JPG "pic_crowd", marked in orange: its loading is initiated through JavaScript, but rendering occurs before the image is fully loaded. Images are not render blocking.

Optimizing the rendering path

Before we get to the real magic we will look at some general optimizations for page rendering.

Stylesheets in <head>

Stylesheets should be supplied as early as possible. The browser defers rendering until the CSSOM is completely constructed and evaluated, so the download of CSS files should start as early as possible.

Synchronous scripts to the bottom of the page

JavaScript is downloaded and evaluated immediately when discovered, and during this whole process rendering is blocked. By putting these files at the end of the page, we allow the DOM to be constructed earlier. Careful though: for JavaScript that heavily contributes to the page content, you may want to consider loading it earlier, since perceived speed might improve when execution occurs earlier.
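As a sketch (file name assumed), the recommendation boils down to moving synchronous scripts from the head to just before the closing body tag:

```html
<body>
  <p>All of this content can be parsed before the script is discovered.</p>
  <!-- Discovered last: the DOM above is already complete at this point -->
  <script src="app.js"></script>
</body>
```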

Minimizing HTTP requests

Splitting up your files so they all load in parallel sounds logical at first, but browsers enforce a limit on concurrent downloads per domain, usually between 6 and 8 connections.

Also consider the delay versus the actual download time: an HTTP/1 request is hugely expensive.

HTTP/1 request

Blue depicts the actual download time, green the delay until the first byte is received. This delay is dramatic for smaller files. So always consider bundling small scripts and styles or inlining them. For static images, such as icons, consider using sprites. This also frees up the download queue and optimizes your parallel downloads.

Optimizing for concurrent downloads

As just discussed, the browser's limit on concurrent downloads is a point to consider. It might be advantageous to make sure that the first few files being downloaded are the ones actually required to render the page.

A general recommendation is to utilize "domain sharding". Domain sharding is a technique for loading resources from different (sub)domains. We might load images from images.domain.com and scripts and CSS from the original domain.com. This way we work around the limit on concurrent connections imposed by browsers.
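A sketch of what sharding might look like; the subdomains here are hypothetical:

```html
<!-- Scripts and styles from the main domain -->
<link rel="stylesheet" href="https://domain.com/css/style.css">
<script src="https://domain.com/js/app.js"></script>

<!-- Images from a separate subdomain, opening a second connection pool -->
<img src="https://images.domain.com/header.jpg" alt="Header">
```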

Minimizing files

This is a no-brainer and you have certainly heard of it before, and yet many sites just don't do it. The effect of minifying, especially for CSS and JavaScript files, is huge. There are countless tools that automate this task by integrating it into your build process. My general advice: minify everything, including HTML, especially since you're going to inline scripts and CSS into your HTML. A tip: look for minification tools that not only strip whitespace but also shorten function names and general syntax and delete redundancies.

You also want to look into gzipping your files. (Not covered in this article, since it's a server-side improvement.)

Loading CSS asynchronously

The only thing we can do to eliminate render-blocking CSS resources is to load them asynchronously. We need a way to tell the browser that a file is not mandatory for rendering but should still be downloaded.

Of course there is a part of our CSS we don't want to load asynchronously: the CSS that is actually important for the layout of our site. But we can usually identify some CSS that doesn't need to be evaluated immediately.

Of course I'm talking about separating your CSS into critical and non-critical CSS.

Loading the critical CSS first and the less important parts later achieves a progressive rendering of the page.

Critical CSS

  • layout
  • above-the-fold elements
  • media queries that apply to the current viewport

Non-critical CSS

  • below-the-fold elements
  • media queries for other viewports
  • icon fonts

Let’s look at the different ways to load CSS asynchronously.

the media attribute

With the media attribute, the browser downloads CSS files that don't match the current media condition at low priority (the exact behavior ultimately depends on the browser) and does not evaluate them. They are still downloaded, just later, and evaluated once their condition matches.

So in the case of big CSS files, it might be worth splitting them up. An immediate and easy start is conditionally loading media queries.
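For example (file names assumed), a print stylesheet or a viewport-specific stylesheet can be declared with a media condition:

```html
<!-- Always render blocking -->
<link rel="stylesheet" href="style.css">

<!-- Downloaded at low priority; never render blocking on screens -->
<link rel="stylesheet" href="print.css" media="print">

<!-- Only render blocking when the condition matches the viewport -->
<link rel="stylesheet" href="small.css" media="(max-width: 600px)">
```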

the preload keyword

A more realistic use case is a CSS file that you always want to download asynchronously. Here you can make use of the new preload keyword.

<link rel="preload">

In order to make this work we need to utilize the onload handler and call some JavaScript.

onload fires when the file is downloaded; we then change the rel attribute to "stylesheet", signaling the browser to evaluate it.
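Put together, the pattern looks like this (file name assumed); the noscript fallback covers visitors without JavaScript:

```html
<link rel="preload" href="non-critical.css" as="style"
      onload="this.onload=null; this.rel='stylesheet'">
<noscript>
  <link rel="stylesheet" href="non-critical.css">
</noscript>
```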

Among the first to play around with this was the Filament Group, who also wrote loadCSS, a really clever piece of JavaScript that makes asynchronous loading possible even without preload.

I highly recommend using loadCSS as a fallback for older browsers that don't support preload yet.

loadCSS gives the link element a media condition that will never apply, so the browser loads the file asynchronously. Once the file is downloaded, the condition is replaced and the CSS file is evaluated.
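Stripped of its feature detection, the idea behind loadCSS can be sketched like this (the actual library applies the trick via JavaScript and is more careful):

```html
<!-- "only x" never matches any device, so the browser fetches the file
     without blocking rendering; onload then switches the stylesheet live -->
<link rel="stylesheet" href="non-critical.css" media="only x"
      onload="this.media='all'">
```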

Font loading

Web fonts are now in use on pretty much every website and deserve an extra paragraph. The browser usually delays rendering of text until the font defined in the CSS is loaded. This can be a major problem, since it essentially means your text is waiting on an HTTP request. From this we can derive some immediate rules:

Fonts should have a fallback in case the HTTP request fails; this is what "font stacks" are for.

This is easy from a technological standpoint, we define multiple fonts in our CSS as follows:
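A font stack is just a comma-separated font-family list; the fonts here are examples:

```css
body {
  /* Try the web font first, then fall back through
     common system fonts, then any sans-serif */
  font-family: "Open Sans", "Helvetica Neue", Arial, sans-serif;
}
```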

If the first font is not available, the second is used, and so on. I said this is easy from a technological standpoint; it's less so from a design perspective, since you want to define fonts that share certain similarities. You can read more about font stacks here.

Fonts should be loaded asynchronously

This is hard. As soon as the font family is set in your CSS, the browser delays painting the text, ignoring our fallback fonts, since it knows the font is there, just not yet loaded. An immediate solution comes to mind: set the font family only when the font file has finished loading. This can actually be achieved with Typekit's Web Font Loader.
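A sketch of this pattern with the Web Font Loader (the font name and the fonts-loaded class are my own placeholders): the web font is only applied once the loader signals that the font is available.

```html
<script src="https://ajax.googleapis.com/ajax/libs/webfont/1.6.26/webfont.js"></script>
<script>
  WebFont.load({
    google: { families: ['Open Sans'] },
    active: function () {
      // Font is loaded: flip a class so the rule below takes effect
      document.documentElement.className += ' fonts-loaded';
    }
  });
</script>
<style>
  body { font-family: Arial, sans-serif; }  /* fallback renders first */
  .fonts-loaded body { font-family: 'Open Sans', sans-serif; }
</style>
```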

There is a problem with this solution though: changing a CSS attribute forces the browser to re-render, at least partially. (Reflow and re-rendering won't be discussed here, as they would blow the already huge boundaries of this article.) Ultimately, a re-render of the body element, or whichever element contains your font declaration (usually one high up in the CSSOM trunk), forces at least all the text, but likely your entire website, to re-render. From a user-experience perspective, re-rendering can range from awkward to horrendous depending on your implementation. However it is done, users will experience a flicker of some sort.

The topic of font loading deserves an article of its own, but I wanted to shine a light on its problems and predicaments. Ultimately though, the approach most websites choose is more a matter of caching than of page loading.

There is one solution that sidesteps all of this: use default system fonts. That's right!
Let the OS handle the font, like native apps do and GitHub does.

Delayed or Asynchronous JavaScript

Asynchronous JavaScript

We've already established that it's best practice to include JavaScript at the bottom of the page. Placing JavaScript at the bottom doesn't stop it from blocking; the browser just might already be able to render the page up to that point.

Let's take more control. By defining a script as async, we completely eliminate render blocking: the JavaScript is downloaded and executed alongside the DOM parsing.

Async scripts are completely detached from the parsing process, they may download and execute at any point. (If you’re a seasoned programmer, this might already ring some warning bells. )
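Marking a script as async is a one-word change (file name assumed):

```html
<!-- Fetched in parallel with parsing; executed whenever it arrives,
     in no guaranteed order relative to other scripts -->
<script async src="analytics.js"></script>
```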

We've already established the best practice of including our scripts at the bottom of the page. But wouldn't it make sense to put async scripts in the head, right after our CSS? They would be downloaded early and therefore execute earlier, without any blocking.

Well, yes, but not all browsers support the async keyword yet, and in those browsers a script in the head is render blocking. Decide for yourself.

What does async do for us when the script is at the bottom anyway?

The browser encounters the script at the bottom; most browsers are already able to render most of the DOM at this point. It loads the script and executes it when ready, and "ready" will almost certainly be after parsing is complete anyway. So where are the benefits?

Well, there is more to consider: we've already looked at the prefetching mechanism in modern browsers. Since async signals that the file may be executed whenever it's loaded, it may be fetched by the prefetcher and execute while parsing is still blocked a bit further up the page. This is a big deal, since parsing will almost certainly be blocked at some point.

Asynchronous execution bears some challenges. Important to consider:

  • Async scripts can execute in any order, depending on which script loads first, so a race condition may occur.
  • A lot of JavaScript depends on an already fully or partially constructed DOM, which might not be the case when the script executes early, for example after being fetched by the prefetcher. The JavaScript might try to access HTML elements that are not yet parsed.

Defer JavaScript

In some cases we would rather defer a script, meaning it is executed only after the page is fully parsed.

There is already a widely supported keyword to accomplish this:
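The markup is a one-word change as well (file names assumed):

```html
<!-- Downloaded in parallel with parsing, executed in document
     order after parsing finishes, before DOMContentLoaded -->
<script defer src="framework.js"></script>
<script defer src="app.js"></script>
```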

There are some things to keep in mind with deferred JavaScript:

  • Make sure to place deferred scripts in the right order when they have dependencies on each other.
  • Keep in mind that the JavaScript is only executed once the page is fully parsed; this state corresponds to jQuery's $(document).ready().

Defer vs. Async

Since there is still some controversy over whether to use async or defer: async is more performant, since it allows execution to be completely decoupled from page parsing, which also means a script may execute without the CSSOM being constructed, as soon as it is discovered and loaded by the prefetcher.

But rather than seeing defer and async as competing means, consider them both great tools for different use cases.

Many developers struggle to make page-manipulating async scripts work. Keep in mind that async makes no guarantee about execution order! Sometimes defer is the better option.

What you might want to defer or async:

  • Almost any form of third party scripts
  • Analytics
  • Advertisement
  • Animations, bells and whistles

Defer image loading

Images are not render-blocking resources, but we've already discussed the limit on concurrent connections to one domain. That's why it's usually recommended to load assets like images from separate domains, preferably through a CDN. The reality, though, is that most websites load their resources from the same domain. And sometimes, for image-heavy websites, this problem is just not solvable.

Let's look at an extreme example to showcase this.

We can see the bulk of downloads and Chrome's limit of six concurrent downloads.

Enter "lazy loading". With lazy loading, we load images after the first page load; they don't occupy any HTTP connection until DOMContentLoaded. This proves especially valuable for below-the-fold content: images you won't see without scrolling anyway may as well be loaded later.

Not all of your images may be viable for lazy loading. Prominent images in your above-the-fold content should load as fast as possible to improve perceived speed.
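A minimal lazy-loading sketch (the data-src attribute and lazy class are my own naming): the real image URL is held in data-src and swapped into src after the first page load.

```html
<img class="lazy" data-src="photo-large.jpg" alt="A large photo">

<script>
  // After first page load, copy data-src into src so the
  // browser starts fetching the deferred images
  window.addEventListener('load', function () {
    var images = document.querySelectorAll('img.lazy');
    for (var i = 0; i < images.length; i++) {
      images[i].src = images[i].getAttribute('data-src');
    }
  });
</script>
```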

A good approach is to give your images width and height attributes, so the browser knows the dimensions early on and the content box won't jump once the image is loaded.
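For example (dimensions assumed), the browser can reserve the box before a single image byte arrives:

```html
<!-- The 600×400 box is laid out immediately; the page
     doesn't jump when the image finishes loading -->
<img src="photo.jpg" width="600" height="400" alt="A photo">
```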

In general, lazy loading or not, it's a good idea to provide placeholders for not-yet-loaded images. This can be as easy as:
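A placeholder can be as simple as a background color on the image element:

```css
/* Shown in the reserved box until the actual image is painted */
img {
  background-color: #efefef;
}
```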

Or, as seen on other websites, loading a low-quality version of the image first.

The AMP project takes a similar approach with images and resources in general:

"The runtime may choose to delay or prioritize resource loading based on the viewport position, system resources, connection bandwidth, or other factors." (From the amp-img tag documentation.)

Conclusion

The tools and methods described here are the solid basis of high-speed websites. There are other important factors, such as caching or layout thrashing, but none of them will help when the initial rendering is not optimal and the first page load is therefore slow.

Website performance is a highly discussed and very interesting topic, and technical advancements constantly open up new means of advancing the field. Please let me know if you have additions or corrections to this article.
