How we optimized our Next.js app for web performance

Artur Szott
Typeform's Engineering Blog
Nov 4, 2022

This blog post is a story about a series of improvements. Mostly full of joy, but also with a bit of letdown (learning).

When I joined Typeform, I became a member of a team caring for our beautifully crafted homepage. It had a decent conversion rate after a few rounds of A/B tests.

Typeform’s homepage

Something still worried me. Everything was loading blazing fast on the development machines powered by M1 processors. But I had doubts that all customers owned one.

A quick look at our Lighthouse scores revealed the truth. It was time to roll up the sleeves.

First findings: what is the size of the video?

And then Lighthouse spoke: “Avoid enormous network payloads.”

What did it mean exactly? In the content “above the fold,” we served a video spanning the entire screen. The file size was big, so after typing “how to optimize video” into Google, I picked one of the top results and uploaded our files to VideoSmaller.

The name of the service held its promise! It shrank the files by around 65–85% without visible quality loss, at the same resolution. Good start.

Why is our JS bundle so big?

Nobody knew. A big JS bundle is usually a result of continuous feature development. Time flies, and the amount of shipped code grows.

When questioning the size of a JS bundle, it is time to summon the Webpack Bundle Analyzer to the rescue!
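
For Next.js, the analyzer comes wrapped as the @next/bundle-analyzer package. A minimal sketch of wiring it up (the rest of the config is omitted):

```js
// next.config.js
const withBundleAnalyzer = require('@next/bundle-analyzer')({
  // Only run the analyzer when explicitly asked for.
  enabled: process.env.ANALYZE === 'true',
});

module.exports = withBundleAnalyzer({
  // ...the rest of the Next.js configuration
});
```

Running ANALYZE=true next build then produces an interactive treemap of every chunk.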

With this Next.js plugin, I discovered that one of the culprits was an incorrect import from our design system. Instead of importing only the footer component, we were pulling in everything. That small change shaved our bundle from 320KB to 250KB of gzipped code!
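
The fix itself was tiny. A sketch of the before and after, with a made-up package name standing in for our design system:

```js
// Before: a barrel import that pulled the entire design system into the bundle.
// import { Footer } from '@acme/design-system';

// After: a direct path import that brings in only the footer component.
import Footer from '@acme/design-system/lib/footer';
```

Whether a path import like this is needed depends on how well the package supports tree shaking; ours did not at the time.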

What if our bundle grows again? There was a way to prevent that and increase awareness among the people working on the codebase. I will add more about that later.

You got gzip compression; how about a better one?

By default, Next.js uses gzip to compress assets served to the browser. It magically reduces files by up to 70% of their original size. But there is a better way. It’s called Brotli, and it offers up to 21% better compression than gzip.

For our web pages, we use AWS CloudFront as our content delivery network. To use the better compression algorithm, we had to turn off compression on the server and let CloudFront fetch the uncompressed files. By changing the CloudFront configuration, we now serve all our assets with improved compression. Not only does it improve download times for our customers, it also reduces infrastructure costs: the CDN charges for every GB it serves, and it now serves smaller files.

This change, however, created a little problem for local development. When we turned off compression on the Node.js server, we started seeing big file sizes. To mimic our production setup, we introduced an environment variable that turns compression back on only in development.
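
In Next.js this boils down to a single flag. A sketch of the idea; the variable name is illustrative, not the one we actually use:

```js
// next.config.js
module.exports = {
  // In production, CloudFront compresses at the edge (Brotli or gzip), so the
  // Node.js server ships uncompressed responses. Locally there is no CDN, so a
  // flag turns gzip back on to keep file sizes close to what users really get.
  compress: process.env.COMPRESS_LOCALLY === 'true',
};
```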

That was still a big win with just a tiny complexity increase.

Core Web Vitals, how fonts impact them, and the first mistake

Every resource loaded on a page can impact different performance metrics. We treat these metrics as a reflection of how our users perceive the page. They can also impact your SEO: search engines like Google use them in their ranking algorithm, and a poor score can push you down in the search results.

One of those metrics is Largest Contentful Paint (LCP), which measures when the largest, most meaningful element that catches visitors’ attention gets rendered. On the homepage, this should be the top-left headline section.

LCP element of Typeform’s homepage

I assumed that loading smaller font files would speed up rendering the main headline.

I found a brilliant article about font subsets and how to optimize fonts. Instead of loading all the characters of a font, you slice it into the parts most likely to be used, such as only the English characters. Other symbols land in separate files that are requested only when needed. This led me to use zx to run pyftsubset, and then some .scss mixins to create a font stylesheet from the new unicode ranges. After the first test, the page requested only the font files containing symbols actually used on the page.
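
The subsetting step can be scripted. A minimal sketch with zx, assuming fonttools’ pyftsubset is installed; the file names and the unicode range are illustrative, not the exact subsets we shipped:

```js
// subset-fonts.mjs — run with `npx zx subset-fonts.mjs`
import { $ } from 'zx';

// Basic Latin plus a few common punctuation characters.
const latin = 'U+0000-00FF,U+2013-2014,U+2018-201D,U+2026';

await $`pyftsubset fonts/Brand-Regular.ttf --flavor=woff2 --unicodes=${latin} --output-file=public/fonts/Brand-Regular-latin.woff2`;
```

Each subset then gets its own @font-face rule with a matching unicode-range, so the browser only downloads the files whose characters actually appear on the page.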

This optimization sliced the total font size by around 50%.

Unfortunately, it was hard to say if it was a success. I was not aware that something else on our page was impacting the same metric I wanted to improve. It made it impossible to draw conclusions from an A/B experiment in which 50% of our visitors received an optimized version of the fonts and the others did not.

What was the problem? Our cookie banner and how it loaded.

The cookie banner loads after a few seconds, hiding everything on the page.

The font optimization did not bring the expected results because the banner arrived on the page a few seconds after the headline and stole all the attention. This event was visible in the metrics, and the A/B test was inconclusive.

Why was working on the font optimization a mistake at the time?

The answer to this question is a quote from the Theory of Constraints:

“Any improvements made anywhere besides the bottleneck are an illusion.”

We could bring multiple isolated improvements to our page, but if we did not fix the biggest problem, they might not have any measurable impact. It was nice to decrease the font file size, but the time was not right.

I parked this topic and proceeded to the next one. A couple of months later we came back to it, built our own cookie banner, and improved the LCP metric by 1–3 seconds, depending on the page.

Automate the awareness

We needed a more organized approach to fight the Time To Interactive (TTI) metric. It measures the time until the page is fully interactive, meaning nothing is blocking the main JS thread. Our two biggest enemies were script evaluation and runtime. JavaScript library sizes contributed to the evaluation time. If one developer swapped a library for a smaller one, how would we know that someone else would not reverse that change in the future? Assigning someone to be solely responsible for page performance was not efficient. Adding automation to pull requests was.

Say hello to the Next.js Bundle analysis action!

Stats served for each GitHub PR

The visualized bundle size change came from moving away from the getInitialProps method in _app.js to getServerSideProps on every page. The first provided a way to fetch common data for every page. The second allowed efficient tree shaking of the data-fetching code; we no longer shipped it to the client.
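
To illustrate the shift, here is a rough sketch; fetchCommonData and its import path are hypothetical stand-ins for our real data-fetching helper:

```js
// pages/index.js
import { fetchCommonData } from '../lib/common-data'; // hypothetical helper

// Before, _app.js defined getInitialProps, so the shared data-fetching code was
// bundled for the client as well. With getServerSideProps, Next.js strips this
// function (and the imports only it uses) from the client bundle.
export async function getServerSideProps() {
  const commonData = await fetchCommonData();
  return { props: { commonData } };
}

export default function HomePage({ commonData }) {
  return <main>{/* render the page with commonData */}</main>;
}
```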

Thanks to pull-request data, every engineer could make an informed decision about a library choice or be alerted to unexpected changes.

“There is no silver bullet that’s going to fix that. No, we are going to have to use a lot of lead bullets.”

These words came from Bill Turpin of Netscape when he discussed their web server’s performance with Ben Horowitz. When I looked again at our bundle analysis, I had the same thoughts.

The next idea was to replace some JS libraries or remove them entirely.

We started with the biggest one. Framer Motion is an animation library, and its size was half of our whole design system. I started with a search for possible replacements via its entry on Bundlephobia. Then I looked at how we used it within our components. In most cases, it implemented simple transition animations that were fully replaceable with pure CSS. Two more complex uses required React Transition Group, which handles components appearing and disappearing. This change also helped stabilize our visual tests: we could finally disable all animations easily and take static screenshots.
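
For the more complex cases, the replacement looked roughly like this; the component, class names, and timing are illustrative:

```jsx
import { useRef } from 'react';
import { CSSTransition } from 'react-transition-group';

// Fades its children in and out based on the `show` prop, with plain CSS doing
// the actual animation work instead of framer-motion.
export function Fade({ show, children }) {
  const nodeRef = useRef(null);
  return (
    <CSSTransition nodeRef={nodeRef} in={show} timeout={300} classNames="fade" unmountOnExit>
      <div ref={nodeRef}>{children}</div>
    </CSSTransition>
  );
}

// Matching styles, e.g. in a global stylesheet:
// .fade-enter        { opacity: 0; }
// .fade-enter-active { opacity: 1; transition: opacity 300ms; }
// .fade-exit         { opacity: 1; }
// .fade-exit-active  { opacity: 0; transition: opacity 300ms; }
```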

The next area we spotted was our translation files. Our previous implementation loaded everything at once: a single JS object with a field per language, each importing a JSON file with all the translations. Searching for a way to divide the translations into smaller chunks brought us to the next-translate package. We liked its API, which loads only the JSON namespaces a page needs. We also switched from loading all languages to loading only the active one.
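
The per-page mapping lives in next-translate’s config file. A sketch, with the locales and namespace names made up for illustration:

```js
// i18n.js
module.exports = {
  locales: ['en', 'es', 'de'],
  defaultLocale: 'en',
  // Each page loads only the namespaces listed here, and only for the active
  // locale, instead of one big object with every translation.
  pages: {
    '*': ['common'],
    '/': ['homepage'],
  },
};

// In a component:
//   import useTranslation from 'next-translate/useTranslation';
//   const { t } = useTranslation('homepage');
//   <h1>{t('hero.title')}</h1>
```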

A few other adjustments were moving from lodash modules to micro-dash and ditching react-responsive in favor of react-use, which was part of our app anyway. These changes saved us maybe around 10KB. Not much on its own, but it still counts toward the total.
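
As an example of the react-responsive swap, a media-query hook built on react-use might look like this; the breakpoint is illustrative:

```js
import { useMedia } from 'react-use';

// Returns true on viewports at least 1024px wide; the second argument is the
// default value used during server rendering.
export function useIsDesktop() {
  return useMedia('(min-width: 1024px)', false);
}
```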

Solving problems with Next.js dynamic imports

Dynamic imports are a great feature. They allow you to branch your code and load only components used on the page.

Branching code with dynamic imports.
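
In Next.js, a dynamic import looks roughly like this; the component names and paths are illustrative:

```jsx
import dynamic from 'next/dynamic';
import Hero from '../components/Hero'; // hypothetical above-the-fold component

// The code for PricingTable is split into its own chunk. With ssr: true (the
// default), its HTML is still rendered on the server.
const PricingTable = dynamic(() => import('../components/PricingTable'), {
  ssr: true,
});

export default function HomePage() {
  return (
    <main>
      <Hero />
      <PricingTable />
    </main>
  );
}
```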

Some admin apps may benefit from client-side-only loading, but our situation involved server-side rendering. How is that different? For SEO purposes, we still wanted to serve the full-page HTML, but with dynamic imports, all the JS chunks used to render the page are appended to the document and load asynchronously. React can then hydrate all the components on the page to make them interactive.

What we wanted to achieve was Partial or Progressive Hydration. There is no reason to serve code that is not yet needed. If something lies below our browser window, why do we load all the JS for it?

Currently, Next.js does not have this capability built in, so I had to look elsewhere. The search and some trial and error led me to next-lazy-hydrate.

I carefully extracted components that were usually below the fold and moved them to separate files.

Then the whole homepage rendering after changes looked like this:

Looks too good to be true? Yes, it involves a small trade-off. If a lazy-hydrated component uses Context, then when the component loads, the original HTML from the server rendering is unmounted and a new one is mounted again. This results in a short blink. Fortunately, we can mitigate it by setting a higher offset, like “start loading it when we are 300px away from it.”
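
To show the general pattern, here is a sketch using the react-lazy-hydration package’s LazyHydrate wrapper rather than next-lazy-hydrate’s exact API; the component names and the 300px rootMargin standing in for the offset are illustrative:

```jsx
import dynamic from 'next/dynamic';
import LazyHydrate from 'react-lazy-hydration';

// A below-the-fold section extracted into its own file, so its JS lands in a
// separate chunk.
const Testimonials = dynamic(() => import('../sections/Testimonials'));

export default function HomePage() {
  return (
    <main>
      {/* Above-the-fold content renders and hydrates as usual. */}
      {/* The section below keeps its server-rendered HTML and only hydrates
          when it is about to scroll into view. */}
      <LazyHydrate whenVisible={{ rootMargin: '300px' }}>
        <Testimonials />
      </LazyHydrate>
    </main>
  );
}
```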

The same lazy-hydration trick worked nicely on the footer content, which is quite heavy and impacts every page. It saved another 35KB. It was also fun to discover that our language switcher was tightly coupled with the footer: when we did not load the modules within the footer, they attached themselves to the single language dropdown during compilation. Decoupling the two modules sorted everything out.

Render blocking resource — font stylesheet

Our page was already preloading font files, but at some point we still had to describe all the font families via a stylesheet. Typeform shares one default CSS file across all apps, which means typeform.com has to connect to a different domain, font.typeform.com, and request the file.

This idea for improvement was brought to our team by one of our SEO experts, who was looking for ways to decrease the LCP metric further. We analyzed the options and settled on inlining the font stylesheet in the HTML document, skipping the connection to the separate domain entirely.

Inlining external font stylesheet in HTML document

For mobile devices, where latency is higher than on desktop, it might cut the loading time by 0.5s. Once the prepared HTML file is cached, no additional request is needed to reach the contents of the stylesheet.
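
One way to do this in Next.js is to read the stylesheet on the server and inline it in _document. A sketch, assuming the font CSS has already been downloaded to a local file at build time (the path is illustrative):

```jsx
// pages/_document.js
import { Html, Head, Main, NextScript } from 'next/document';
import { readFileSync } from 'fs';
import { join } from 'path';

// _document only runs on the server, so reading from disk is fine here.
const fontCss = readFileSync(join(process.cwd(), 'styles', 'fonts.css'), 'utf8');

export default function Document() {
  return (
    <Html>
      <Head>
        {/* Inline the @font-face rules instead of linking to font.typeform.com */}
        <style dangerouslySetInnerHTML={{ __html: fontCss }} />
      </Head>
      <body>
        <Main />
        <NextScript />
      </body>
    </Html>
  );
}
```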

Summary

After all these changes, we went down from an initial 370KB to 109KB of the JavaScript code loaded for every page, reducing the total blocking time by an average of 30%.

Although I wrote this story from my perspective, I would be nowhere without my teammates. Their knowledge of our infrastructure, focus, creativity, and dedication made it possible to introduce all these changes, for which I am very thankful. We moved through obstacles together and learned a lot along the way. This journey took us almost a year while working across different areas of the site. Barcelona was not built in a day!

There are still many opportunities to continue this story, and I hope that one day we will reach the magical Lighthouse score of 100.

What is this all for if not to provide a pleasant site experience?
