An Unravelling Tale of Performance, Part 1: Analysis
A classic tale of love, loss, and web performance.
We launched our new homepage last month. It’s a massive refresh of our original design. Clean, bold, focused, lots of whitespace.
It is, like most startup output, the product of an MVP mentality. Launch, learn, iterate.
This is a good approach to rapidly arrive at a product that works better for users. But as the feature machine trundles ever-forward, this process can leave “good enough” features in its substantial wake.
We’re finally at a point where concentrating on performance and polish will yield better returns than Yet Another Feature. The new homepage itself has been tweaked to satisfaction across a good balance of users.
The site was originally designed with performance in mind, but after a year of API, design and feature iteration, it’s time for a performance audit.
In this post, I’m going to audit the DriveTribe homepage with online reporting tools, Chrome DevTools and Google Analytics. I’ll outline my findings and formulate a plan, the execution of which we’ll follow in subsequent posts.
It’s easy for us developers to sit here in our silicon towers, browsing with our MacBook Pros and swanky city-office internet connections, and mistakenly believe our sites are fast.
This often isn’t true. To get a more objective measure of page performance, we can use a combination of:
- Reporting tools, like Google PageSpeed Insights, to generate high-level reports, suggestions, and a headline meta-score. WebPageTest can test from practically anywhere on Earth and returns a detailed waterfall view of every request.
- Chrome DevTools, to leverage connection and CPU throttling with filmstrip capturing to get a feel for what people on slow devices experience.
- Google Analytics, to measure the real effect this has on users. For instance, what is our bounce rate on the homepage?
With appendages firmly crossed, let’s dive in.
The majority of our (humankind’s) search traffic comes from Google, and they take performance into consideration for page ranking. So it seems prudent to use Google PageSpeed Insights and the Google Lighthouse Chrome plugin to generate performance reports.
These reports analyse how fast your site is across desktop and mobile, and make a series of actionable suggestions. It’s a reasonable assumption that making good on these suggestions will contribute to a better PageRank.
I ran the reports, and this was the result:
Both reports cite off-screen images and render-blocking CSS requests as an issue. We only have one stylesheet, but a whole load of images. This is a potential win.
The Lighthouse report suggests our first meaningful paint doesn’t happen for a whopping 9.6 seconds, which doesn’t match my experience. So I ran it again. And lo, the second time we received a time of 3.6 seconds, which is still too slow.
This highlights that it’s worth running Lighthouse a few times to get a better sense of the numbers. The perceptual speed index in the image above halved between runs, again still too slow, but it’s worth bearing the weather in mind.
Both Lighthouse and PageSpeed Insights report it takes ~0.65 seconds for the server to even respond, a measurement known as Time To First Byte (TTFB). When we first launched DriveTribe, this was nearly instantaneous, so it’ll be interesting to see what’s changed there.
I then threw the page to WebPageTest, like tenderised raw meat into the maw of a rabid dog. What fresh hell would await?
Oh. Not as bad (at first glance)! TTFB is still a clear problem. LA servers report around 1.2 seconds (hence the F), whereas London servers predictably report around 0.65 seconds, which earns an A.
WebPageTest provides a very clear waterfall diagram of each request. Chrome can also be used for this, but personally I think WebPageTest is clearer.
If we look at our first three requests, we can actually see the massive TTFB as the light blue:
The CSS files, which Google described to us as “blocking”, finish downloading before the page itself does. It would be easy to take this as a sign that they’re not blocking in any practical sense, but since pages can start rendering before the full HTML payload arrives, that isn’t true.
There was also this:
All those purple bars are image requests, and we know from the previous reports that most of these aren’t needed to render the initial viewport. Those requests are numbered 21 to 53, which means there are at least 33 totally unnecessary requests.
The good news is that vertical gold line is “DOM interactive”, which means users can still use the page before the images have loaded.
However, sending all these requests isn’t just a matter of performance, it’s a matter of principle. Many users have expensive data plans and if we don’t need to send something, we shouldn’t.
The hard numbers
With Chrome DevTools, I took a look at what was actually going on.
Using device emulation, I set the site to “low-end mobile”, which throttles both the connection (to “slow 3G”) and the CPU (to “vacuum tube”).
The main part of the homepage, the hero banner, wasn’t readable for 6 seconds. Not as bad as I’d feared, frankly, but nowhere near good enough.
Our HTML payload is being reported as a whopping 400kb. I was suspicious of this, so I downloaded and measured the file — it is 72kb. WebPageTest reports the same (underlining a benefit of using multiple tools). Still, there’s more to this than meets the eye, and I believe there are wins to be had here.
Something that hadn’t been flagged by any reporting tool was the sheer amount of time our main JS payload (225kb gzipped and minified!) took to download: 18 seconds. Because the site is server-rendered, you can still click on links and have a usable experience. But some things, like the alerts panel, would remain inoperable for that whole time.
Part of the point of using React is the ability to reuse DOM, components and data for subsequent page loads. But the main.js file is thrown away if the user navigates away within that 18-second window, which means any potential performance benefit we might derive from that capability is completely nuked, and we’ve wasted hundreds of kb of mobile data.
Finally, we make, in total, 56 image requests.
All in all, for slow devices, this is potentially a fucking disaster.
If this theory is true, I’d expect Google Analytics to show a drastically higher bounce rate for users on mobile devices than on desktop.
Plan of attack
This analysis leaves me with a bunch of different areas to investigate closer:
1. Render-blocking CSS
Both of Google’s reporting tools highlighted two render-blocking CSS assets. One, perhaps ironically, is the Google Fonts stylesheet for Open Sans, loaded in the way Google itself recommends. Maybe there’s a way of requesting this resource without blocking the page from loading.
The other asset is our global stylesheet. Conventional wisdom suggests this needs to be blocking. If you read my previous blog post on migrating to Styled Components you know that we’re getting to a place where only render-critical CSS is loaded.
As the homepage is brand-new, there shouldn’t be many legacy components kicking around that would require the global stylesheet to be render-blocking. We could be ready to take advantage of this.
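A quick sketch of what a non-blocking request could look like: load the stylesheet with a media type that doesn’t currently apply, then flip it once the file arrives, so the browser fetches it without holding up render. The `loadCssAsync` name is mine, and the `doc` parameter exists only so the logic can be exercised outside a browser; in the page you’d pass `document`.

```javascript
// Sketch: load a non-critical stylesheet without blocking render.
// Assumption: this is one possible approach, not our current code.
function loadCssAsync(doc, href) {
  const link = doc.createElement('link');
  link.rel = 'stylesheet';
  // media="print" means the browser downloads the file at low
  // priority without blocking render; once loaded, apply it.
  link.media = 'print';
  link.onload = () => { link.media = 'all'; };
  link.href = href;
  doc.head.appendChild(link);
  return link;
}
```

In the page this would be called with `document` and the Google Fonts URL; whether it behaves acceptably for our font rendering is exactly what the investigation needs to test.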
2. Time to first byte (TTFB)
Our TTFB used to be near-instantaneous, but by all reports it’s now quite high. It’s a matter of debate whether this is a meaningful metric for users, as it doesn’t necessarily mean the page renders quicker. But, there are signs that it correlates to a higher PageRank, making it a worthy subject of optimisation.
A high TTFB might suggest a delay with resolving API requests. Or maybe we’re calling too many endpoints. Are all requests happening in parallel? Do we even need to call all data dependencies from the server or can some be called later?
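To illustrate the parallelism question, here’s a minimal sketch. `getFeed`, `getUser` and `getAlerts` are hypothetical stand-ins for whatever data dependencies the homepage actually has; the point is only the shape of the change:

```javascript
// Sketch: resolve the page's data dependencies in parallel.
// Sequential awaits would sum the latencies; Promise.all takes
// roughly as long as the slowest single request.
async function loadHomepageData({ getFeed, getUser, getAlerts }) {
  const [feed, user, alerts] = await Promise.all([
    getFeed(),
    getUser(),
    getAlerts(),
  ]);
  return { feed, user, alerts };
}
```

Whether our server code is already shaped like this, or is accidentally awaiting requests one by one, is one of the first things to check.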
3. Images and DOM outside the viewport
We naively request and render all the DOM and images needed to render the entire page. However, the user only initially sees whatever’s in the viewport.
A better approach may be to deliver and render only the data and assets needed to render the initial viewport. Or to render everything but only request images when they appear in the viewport.
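The second option can be sketched with IntersectionObserver. This is an assumption about approach, not our current code; each image carries its real URL in `data-src`, and the observer constructor is injected only so the logic can be exercised with a stub:

```javascript
// Sketch: only request an image once it approaches the viewport.
// ObserverCtor would be IntersectionObserver in the browser.
function lazyLoadImages(images, ObserverCtor) {
  const observer = new ObserverCtor((entries) => {
    entries.forEach((entry) => {
      if (!entry.isIntersecting) return;
      const img = entry.target;
      img.src = img.dataset.src; // triggers the real request
      observer.unobserve(img);   // each image only loads once
    });
  }, { rootMargin: '200px' }); // start loading just before it's visible
  images.forEach((img) => observer.observe(img));
  return observer;
}
```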
4. HTML payload
The typical approach to server-side rendering is to embed the data that the server used to render the page as an encoded object in the HTML payload itself.
The downside is a bigger HTML payload. Exactly how much bigger depends on how much data you retrieve from your API. In our case, 55kb of the 72kb payload is data, and it’s sent before the markup itself.
The thinking here was that we can send the data while the server is busy rendering the page. Originally, this did prove to be faster than sending the HTML first.
It’s worth revisiting to ensure the ratio of data to HTML hasn’t shifted so much that this is no longer true, and double-checking it on slower devices, where it could be the network connection, rather than the speed of the server, that’s the bottleneck.
We could also ensure that only the data necessary for the current page is sent. We already strip the models returned from the API down to the data we’ll need in general, which can speed up subsequent page navigation.
For instance, if we have a feed full of posts and a user clicks on one, we currently don’t need to fetch that post’s data again. But for the initial view, we don’t need everything the API returns.
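For reference, the embedding itself is usually just a serialised object in a script tag. A minimal sketch (`embedState` and `__INITIAL_STATE__` are illustrative names, though the escaping of `<` is a genuine requirement, since data containing `</script>` would otherwise terminate the tag early):

```javascript
// Sketch: serialise server state into the HTML payload for the
// client to pick up. Replacing "<" with its escape sequence stops
// "</script>" inside the data from closing the tag prematurely.
function embedState(state) {
  const json = JSON.stringify(state).replace(/</g, '\\u003c');
  return `<script>window.__INITIAL_STATE__ = ${json};</script>`;
}
```

Every byte of that state travels in the payload, which is why trimming it to what the initial view needs matters.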
5. JS payload
I remember thinking main.js was too large at 130kb, and “I’d better keep an eye on that.” We’ve seen how that turned out. I’ll be looking at reducing this by:
- Auditing what this bundle is made of and making sure no large third-party libraries have snuck in.
- Splitting out a vendor JS file.
- Ensuring Webpack is splitting the bundle as much as possible, even if modules are repeated across bundles.
- Introducing build-time tooling that could potentially fail PRs if any one file gets too large.
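That last idea could be as simple as comparing built file sizes against a budget in CI. A sketch with made-up numbers (the 130kb limit echoes the size I originally thought was already too large; real budgets would live in config):

```javascript
// Sketch: fail a PR check when any bundle exceeds its size budget.
// sizes and limits are maps of filename -> bytes.
function checkBundleSizes(sizes, limits) {
  const failures = Object.entries(limits)
    .filter(([name, limit]) => (sizes[name] || 0) > limit)
    .map(([name, limit]) => `${name}: ${sizes[name]}b exceeds ${limit}b`);
  return { ok: failures.length === 0, failures };
}
```

Run against the build output in a PR check, a non-empty failures list would fail the build before an oversized bundle ships.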
Using Google’s PageSpeed Insights, Lighthouse, WebPageTest and DevTools, we’ve identified five strong candidates for bagging ourselves some much-needed performance wins.
Considering websites are generally the fusion of CSS, HTML, JS, images and server-generation, I can’t think of an area where we can’t improve. Which means we should see big improvements in the coming weeks.
In subsequent posts, I’m going to investigate each of these potential areas for improvement. I’ll find a potential solution, implement it, and release it. I’ll compare performance before and after and hopefully we’ll discover where the time/effort ratio can yield worthwhile wins.