Website Performance: Understanding the Basics
The critical rendering path is crucial for website performance. This article sheds light on browser rendering and provides tools and strategies to optimize for it.
Originally published at fabianstiehle.com
Certain resources on websites are render blocking: the browser can’t paint the page without downloading and evaluating them. On many websites, the browser spends precious first-page-load time downloading and evaluating resources that are not needed to render the initial page. For well-performing websites we need to optimize for fast rendering. This article gives you an overview of the tools and methods for optimizing the rendering path, without getting caught up in implementation details.
But before we dig deeper into optimizing the rendering path, it is crucial to understand how a browser actually renders a website.
Browser Rendering on an abstract level
Before anything else, the browser parses the HTML and creates the Document Object Model, usually abbreviated as DOM. The DOM is a tree structure and is essentially an abstract representation of all valid HTML nodes, visible or not.
The DOM doesn’t contain any information about the visual representation of its elements. This is handled by another vital tree structure: the CSSOM. As you might guess, CSSOM stands for CSS Object Model (CSS itself being Cascading Style Sheets). It contains all valid CSS styling information from both internal and external stylesheets.
From the DOM and CSSOM the render tree (also called the frame tree) is constructed. It describes what actually needs to be rendered: only visible HTML nodes, no head or hidden elements. At this point the CSSOM is matched against the DOM, and the matching styles are attached to each element in the render tree.
Once the render tree is constructed, the browser calculates the coordinates of each element on the screen. This step is called layout, as the browser is laying out the elements on the screen.
Now the webpage can finally be painted on the screen. This is quite literally a painting process, and in modern browsers parts of it are often accelerated by the GPU.
As we can see, rendering a webpage is a very sequential process. It has a lot of potential bottlenecks, one of them very obvious: the browser needs both the CSSOM and the DOM before rendering can occur.
Modern browsers employ prefetching: as soon as the parser is blocked, the browser looks ahead for other resources it will have to load further down the page. No DOM manipulation or parsing occurs during this time; the browser only starts downloading these resources.
Browser rendering in the real world
Let’s look at a waterfall diagram:
This is wordpress.com as seen in Chrome’s network tab. Now let’s verify what was explained above. The timeline diagram extends to the right until the first page render. I also cut out unnecessary information.
Optimizing the rendering path
Before we get to the real magic we will look at some general optimizations for page rendering.
Stylesheets in <head>
Stylesheets should be supplied as early as possible. The browser defers rendering until the CSSOM has been completely downloaded and evaluated, so the download of CSS files should start as early as possible.
Synchronous scripts to the bottom of the page
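A synchronous script blocks HTML parsing at the point where the parser encounters it, which is why it is usually placed right before the closing body tag. The following is a minimal sketch of both placement rules; the file names `styles.css` and `app.js` are assumptions for illustration only.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page</title>
  <!-- Render blocking: referenced early so the download starts as soon as possible -->
  <link rel="stylesheet" href="styles.css">
</head>
<body>
  <h1>Hello</h1>
  <p>This markup is already in the DOM before the script below runs.</p>

  <!-- Synchronous script (hypothetical file): parsing pauses here,
       but everything above has already been parsed -->
  <script src="app.js"></script>
</body>
</html>
```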
Minimizing HTTP requests
Splitting up your files so they all load in parallel sounds logical at first, but browsers enforce a limit on concurrent downloads per domain, usually six to eight connections.
Also consider the delay compared to the actual download time: an HTTP/1 request is hugely expensive.
Blue depicts the actual download time, green the delay until the first byte is received. The effect is dramatic for smaller files, so always consider bundling them or inlining smaller scripts and styles. For static images, such as icons, consider using sprites. This also frees up the download queue and optimizes your parallel downloads.
Optimizing for concurrent downloads
As just discussed, the browser’s limit on concurrent downloads is a point to consider. It can be advantageous to make sure that the first few files being downloaded are the ones actually required to render the page.
A general recommendation is to use “domain sharding”. Domain sharding is a technique for loading resources from different (sub-)domains. We might load images from images.domain.com and scripts and CSS from the original domain.com. This way we work around the limit on concurrent connections imposed by browsers.
You also want to look into gzipping your files. (Not covered in this article since it’s a server-side improvement.)
Loading CSS asynchronously
The only thing we can do to eliminate render-blocking CSS resources is to load them asynchronously. We need a way to tell the browser that a file is not mandatory for rendering but should still be downloaded.
Of course there is a part of our CSS we don’t want to load asynchronously: the CSS that is actually important for the layout of our site. But we can usually identify some CSS we don’t need to apply immediately.
In other words, I’m talking about separating your CSS into critical and non-critical CSS.
Loading the critical CSS first and the less important parts later achieves progressive rendering of the page.
Critical CSS
- Above the fold elements
- Media queries that apply to the viewport
Non-critical CSS
- Beyond the fold elements
- Media queries for other viewports
- Icon fonts
Let’s look at the different ways to load CSS asynchronously.
The media attribute
With the media attribute, the browser downloads CSS files that don’t match the current condition at “low priority” (this ultimately depends on the browser) and does not evaluate them. They are still loaded, just later, and evaluated when the conditions match.
So in cases of big CSS files, it might be worth splitting the file up. An immediate and easy start would be the conditional loading of media queries.
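As a rough sketch of such a split, assuming hypothetical file names and an arbitrary breakpoint, each stylesheet can carry its own media condition. Only the stylesheets whose condition currently matches block rendering:

```html
<!-- No media attribute: always applies, render blocking -->
<link rel="stylesheet" href="base.css">

<!-- Applies only when printing: still downloaded, but does not block
     rendering on screen -->
<link rel="stylesheet" href="print.css" media="print">

<!-- Applies only on viewports at least 1024px wide (breakpoint chosen
     purely for illustration) -->
<link rel="stylesheet" href="wide.css" media="(min-width: 1024px)">
```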
The preload keyword
A more realistic use case is a CSS file that you always want to download asynchronously. In this case you can make use of the new preload keyword.
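A commonly used form of this pattern looks roughly like the sketch below. The file name `non-critical.css` is an assumption, and the `noscript` fallback is an addition for visitors without JavaScript:

```html
<!-- Downloaded without blocking rendering; once loaded, the rel attribute
     is switched so the browser applies the file as a stylesheet -->
<link rel="preload" href="non-critical.css" as="style"
      onload="this.onload=null;this.rel='stylesheet'">

<!-- Without JavaScript the onload handler never runs, so fall back to a
     regular (render-blocking) stylesheet -->
<noscript><link rel="stylesheet" href="non-critical.css"></noscript>
```

Resetting `onload` to `null` keeps the handler from running a second time after `rel` has been switched.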
onload fires when the file is downloaded; then we just change the rel attribute to “stylesheet”, signaling the browser to evaluate it.
I highly recommend using loadCSS as a fallback for older browsers that don’t support preload yet.
loadCSS gives the link element a media condition that will never apply, so the browser loads it asynchronously. After the file is downloaded, the condition is removed and the CSS file is evaluated.
Web fonts are now in use on pretty much every website and deserve a section of their own. The browser usually delays rendering of text until the font defined in the CSS file is loaded. This can be a major problem, since it essentially means your text is waiting for an HTTP request to finish. From this we can derive some immediate rules:
Fonts should have a “fallback” in case the HTTP request fails; this is called a “font stack”
This is easy from a technological standpoint: we list multiple fonts in the font-family declaration of our CSS, in order of preference.
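A font stack might look like the following sketch. The specific fonts are assumptions chosen for illustration, with “Open Sans” standing in for the web font:

```html
<style>
  body {
    /* Tried from left to right: the web font first, then common system
       fonts, and finally the generic sans-serif family */
    font-family: "Open Sans", "Helvetica Neue", Arial, sans-serif;
  }
</style>
```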
If the first is not present, the second gets used, and so on. I said this is easy from a technological standpoint; it is less so from a design perspective, since you want to define fonts that share certain similarities. You can read more about font stacks here.
Fonts should be loaded asynchronously
This is hard. As soon as the font family is set in your CSS file, the browser delays painting the text, ignoring our fallback fonts as well, since it knows the font exists but is just not yet loaded. An immediate solution comes to mind: set the font family only once the font file has finished loading. This can be achieved with Typekit’s web font loader.
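The idea of setting the font family only after the font has loaded can be sketched as follows. Note that this sketch uses the browser’s native CSS Font Loading API (`document.fonts.load`) rather than the web font loader mentioned above. The font name, the file path and the class name `fonts-loaded` are assumptions for illustration.

```html
<style>
  @font-face {
    font-family: "Open Sans";
    src: url("/fonts/open-sans.woff2") format("woff2");
  }
  /* Until the web font is ready, text is painted with system fonts */
  body { font-family: Arial, sans-serif; }
  /* Applied only after the script below has confirmed the font is loaded */
  .fonts-loaded body { font-family: "Open Sans", Arial, sans-serif; }
</style>
<script>
  // Feature check: older browsers without the Font Loading API simply
  // keep the fallback fonts
  if (document.fonts && document.fonts.load) {
    // Requests the matching font face and returns a promise that
    // resolves with the list of loaded faces
    document.fonts.load('1em "Open Sans"').then(function (faces) {
      if (faces.length > 0) {
        document.documentElement.classList.add('fonts-loaded');
      }
    }).catch(function () {
      // Loading failed: the fallback fonts stay in place
    });
  }
</script>
```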
There is a problem with this solution though: changing a CSS property forces the browser to re-render, at least partially. (Reflow and re-rendering won’t be discussed here, since they would exceed the already large scope of this article.) The font declaration usually sits on the body element or another element high up in the tree, so changing it forces at least all the text, and likely your entire website, to re-render. From a user-experience perspective, re-rendering can range from awkward to horrendous depending on your implementation. However it is done, users will experience a flicker of some sort.
The topic of font loading deserves another article, but I wanted to shed light on its problems and predicaments. Ultimately though, the approach most websites choose is more a matter of caching than of page loading.
Async scripts are completely detached from the parsing process; they may download and execute at any point. (If you’re a seasoned programmer, this might already ring some alarm bells.)
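In markup, this is a single attribute on the script element. A minimal sketch, with `analytics.js` as a hypothetical file name:

```html
<!-- Downloaded in parallel with parsing and executed as soon as it
     arrives, regardless of where the parser currently is -->
<script async src="analytics.js"></script>
```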
We’ve already established the best practice for including our scripts at the bottom of the page. But wouldn’t it make sense to put the async scripts in the head, right after our CSS? It would get downloaded early and therefore execute earlier but without any blocking.
Well, yes, but not all browsers support the async keyword just yet, which would make the script in the head render blocking. Decide for yourself.
What does async do for us when the script is at the bottom anyway?
When the browser encounters the script at the bottom, most browsers are already able to render most of the DOM. The browser loads the script and executes it when ready. “Ready” will almost certainly be after parsing is complete as well, so where are the benefits?
Well, there is more to consider. We’ve already looked at the prefetching mechanism in modern browsers: since async signals that the file may be executed whenever it’s loaded, it can be fetched by the prefetcher and executed while parsing is still blocked further up the page. This is a huge deal, since parsing will almost certainly be blocked at some point.
Asynchronous execution poses some challenges. Important to consider:
- Async scripts can execute in any order, depending on which script loads first, so a race condition may occur.
In some cases we would rather defer a script, meaning it is executed after the page is fully parsed.
There is already a widely supported keyword to accomplish this: defer.
- Make sure to place them in the right order when you have dependencies
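A short sketch of deferred scripts with a dependency between them; the file names are hypothetical, and `app.js` is assumed to depend on `library.js`:

```html
<head>
  <!-- Both files download in parallel with parsing. They execute only
       after parsing has finished, in document order: library.js first,
       then app.js -->
  <script defer src="library.js"></script>
  <script defer src="app.js"></script>
</head>
```

Because deferred scripts keep their document order, the dependency is satisfied simply by listing the library first.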
Defer vs. Async
Since there is still some controversy about whether to use async or defer: async is more performant, since it allows execution to be completely decoupled from page parsing, which also means a script may execute before the CSSOM is constructed, as soon as it has been discovered and loaded by the prefetcher.
But rather than seeing defer and async as competing options, treat both as great tools for different use cases.
Most developers struggle to make page-manipulating async scripts work. Keep in mind that async makes no guarantee about execution order! Sometimes defer is the better option.
What you might want to defer or async:
- Almost any form of third party scripts
- Animations, bells and whistles
Defer image loading
Images are not render blocking resources, but we’ve already discussed the limit on concurrent connections to one domain. That’s why it’s usually recommended to load assets like images from separate domains, preferably through a CDN. The reality though is that most websites load their resources from the same domain. And sometimes — for image heavy websites — this problem is just not solvable.
Let’s take a look at an extreme example to illustrate this.
We can see the bulk of downloads and Chrome’s limit of six concurrent downloads.
This is where “lazy loading” comes in. With lazy loading we load images after the first page load, so they don’t take up any HTTP connections until DOMContentLoaded. This proves especially valuable for “below the fold” content: images you won’t see without scrolling may as well be loaded later.
Not all of your images may be suitable for lazy loading. Prominent images in your “above the fold” content should load as fast as possible to improve perceived speed.
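One possible way to sketch lazy loading is shown below. It is not a definitive implementation: the `data-src` attribute, the `lazy` class and the file names are conventions assumed for this example, and the 200px margin is an arbitrary choice. The script waits for DOMContentLoaded and then uses IntersectionObserver to load each image shortly before it scrolls into view.

```html
<!-- Above the fold: loaded normally -->
<img src="hero.jpg" width="1200" height="600" alt="Hero image">

<!-- Below the fold: the real URL is kept in data-src until it is needed -->
<img class="lazy" data-src="gallery-1.jpg" width="600" height="400" alt="Gallery image 1">
<img class="lazy" data-src="gallery-2.jpg" width="600" height="400" alt="Gallery image 2">

<script>
  document.addEventListener('DOMContentLoaded', function () {
    // Copy the NodeList into a real array so forEach works everywhere
    var lazyImages = Array.prototype.slice.call(
      document.querySelectorAll('img.lazy')
    );

    // Start the actual download by moving data-src into src
    function load(img) {
      img.src = img.getAttribute('data-src');
      img.classList.remove('lazy');
    }

    if (!('IntersectionObserver' in window)) {
      // No observer support: load all images once the DOM is ready
      lazyImages.forEach(function (img) { load(img); });
      return;
    }

    var observer = new IntersectionObserver(function (entries) {
      entries.forEach(function (entry) {
        if (entry.isIntersecting) {
          load(entry.target);
          observer.unobserve(entry.target); // each image loads only once
        }
      });
    }, { rootMargin: '200px' }); // begin loading 200px before the viewport

    lazyImages.forEach(function (img) { observer.observe(img); });
  });
</script>
```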
A good approach is to give your images width and height attributes, so the browser knows the dimensions early on and the content box won’t jump after the size of the image is calculated.
In general, lazy loading or not, it’s a good idea to use placeholders for not-yet-loaded images. A placeholder can be very simple, or, as seen on some websites, a low-quality version of the image that is loaded first.
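A very simple placeholder could be a plain background color combined with explicit dimensions, as in this sketch. The class name, color and file name are assumptions for illustration:

```html
<style>
  /* Neutral background that fills the image box until the image has loaded */
  img.placeholder { background-color: #e0e0e0; }
</style>

<!-- width and height reserve the space up front, so the layout does not
     jump when the image arrives -->
<img class="placeholder" src="photo.jpg" width="600" height="400" alt="Photo">
```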
The AMP project takes a similar approach with images and resources in general:
“The runtime may choose to delay or prioritize resource loading based on the viewport position, system resources, connection bandwidth, or other factors.” (From the amp-img tag documentation.)
The tools and methods I just described are the solid basis of high-speed websites. There are other important factors, such as caching or layout thrashing, but none of them will help when the initial rendering is not optimal and the first page load is therefore slow.
Website performance is a widely discussed and very interesting topic, and technical advancements constantly provide new means to advance this field. Please let me know if you have additions or corrections to this article.