Speeding Up NerdWallet 🚗💨
Our recently spun-up Frontend Infrastructure team now has more than six months under its belt, and site performance has been one of our main focuses.
This post summarizes a handful of the macro level things we’ve done to improve and maintain page load performance of nerdwallet.com.
Huge shout out to the product teams that did the brunt of the work to consume these changes as well as implement their own optimizations on individual parts of the site.
Removing Unused Global CSS/JS
NerdWallet was originally created as a PHP monolith. During my time here, one of the largest overarching initiatives has been the move from this monolithic stack to microservices mostly built in Python and micro-apps built with Node/React.
The large amount of CSS and JS we used to serve globally was intertwined with these micro-apps and even with our base React components. This effort was tedious tech-debt cleanup, but here's the net impact of removing the unused code that was being served globally:
- 50kb (gzipped) of JS (190kb parsed) removed
- 30kb (gzipped) of CSS (200kb parsed) removed
Web Font Optimization
I wrote a full blog post about this so you can read more about it there.
TL;DR: Font optimization is quite complicated nowadays, but a concerted effort can produce a meaningful reduction in time-to-first-paint (in our case, as much as a 30% drop). We are subsetting critical fonts and loading them as soon as possible via <link rel="preload">.
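For the subsetting step itself, one common approach is fonttools' pyftsubset CLI, which can strip a font down to a chosen unicode range and emit woff2. The font file names and unicode range below are hypothetical, not our actual values:

```shell
pyftsubset NerdWalletSans.ttf \
  --unicodes="U+0020-007E" \
  --flavor=woff2 \
  --output-file=NerdWalletSans-subset.woff2
```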
Server Side React Render Cache
When we first migrated to React, one of the must-haves was that we couldn't risk losing search ranking by client-side rendering our applications; we had to use server-side rendering to produce the HTML needed for easiest GoogleBot consumption.
However, the React render can take a non-trivial amount of time. If you have a really large tree, it's not uncommon for the render function to approach 100ms, and 100ms per request spent on a synchronous, blocking call means a maximum of 10 requests per second per instance. That's not good.
Since React's server render is a pure function of its inputs, we can memoize it based on all of them (e.g. the redux store state and the react-router location). On a cache hit this can mean an order-of-magnitude or more reduction in render time, improving performance for that page as well as instance throughput.
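A minimal sketch of this memoization, assuming the inputs are serializable (a real implementation would also want key hashing and a proper LRU bound; the plain render function here stands in for ReactDOMServer.renderToString):

```javascript
// Hypothetical memoization wrapper around a server-side render function.
// `render` is assumed to be pure: same (state, location) -> same HTML.
function memoizeRender(render, maxEntries = 500) {
  const cache = new Map();
  return (state, location) => {
    const key = JSON.stringify([state, location]);
    if (cache.has(key)) return cache.get(key);
    const html = render(state, location);
    // Evict the oldest entry once the cache is full (simple FIFO bound).
    if (cache.size >= maxEntries) {
      cache.delete(cache.keys().next().value);
    }
    cache.set(key, html);
    return html;
  };
}

// Usage with a stand-in render function:
let renders = 0;
const render = (state, location) => {
  renders += 1;
  return `<div data-path="${location}">${state.title}</div>`;
};
const cachedRender = memoizeRender(render);

cachedRender({ title: 'Home' }, '/');
cachedRender({ title: 'Home' }, '/'); // cache hit: render not called again
console.log(renders); // 1
```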
Centralized Build Tool
Webpack configuration can be fairly complex. Since we have dozens of individual web-apps that all run webpack themselves we devised a solution to share webpack configs across all of these applications so they could be centrally managed. We landed on using a light wrapper around Electrode’s webpack-config-composer because it can be expressed as a plain object.
With this tool in place, whenever we make a performance optimization via our webpack config, we just update in one place and the benefits are propagated out to all apps when they upgrade / redeploy.
A config produced this way comes with sane defaults for browser consumption while still allowing complete customization, such as specifying a custom value for webpack's minChunks option.
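The pattern can be sketched as plain-object composition. The module shape and option names below are hypothetical; Electrode's actual webpack-config-composer has a richer partial/profile system:

```javascript
// Hypothetical shared defaults that a central package would export.
const browserDefaults = {
  mode: 'production',
  optimization: { minChunks: 2 },
  devtool: 'source-map',
};

// App overrides win over shared defaults.
function composeWebpackConfig(overrides = {}) {
  const config = { ...browserDefaults, ...overrides };
  // Merge nested sections individually so overriding one option
  // (e.g. optimization.minChunks) keeps the other defaults intact.
  for (const key of Object.keys(overrides)) {
    if (
      overrides[key] &&
      typeof overrides[key] === 'object' &&
      !Array.isArray(overrides[key])
    ) {
      config[key] = { ...browserDefaults[key], ...overrides[key] };
    }
  }
  return config;
}

// An individual app customizes only what it needs:
const appConfig = composeWebpackConfig({
  optimization: { minChunks: 3 },
});
console.log(appConfig.optimization.minChunks); // 3
console.log(appConfig.devtool); // 'source-map'
```

The benefit is exactly what the post describes: a performance tweak to the shared defaults propagates to every app on its next upgrade/redeploy.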
As part of creating this centralized webpack interface, we upgraded our version of webpack. One challenge we ran into when upgrading from webpack 1 to webpack 3 (this was prior to webpack 4 being released) is that the dedupe plugin was removed, and our bundles got bigger as a result.
This was unexpected, since we had assumed the later version would be better for performance. We ended up rolling our own webpack dedupe plugin to restore the same functionality.
When we update our supported-browser list based on traffic data from Google Analytics, apps pick up the change as they rebuild and redeploy, and their JS is automatically transpiled to reflect the browsers we support.
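A browser-support policy like this is commonly expressed as a browserslist config, which tools such as Babel's preset-env read at build time. The queries below are illustrative, not our actual list:

```
# .browserslistrc (hypothetical thresholds)
> 0.5%
last 2 versions
not dead
not ie <= 10
```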
We built a React Image component to codify best practices (e.g. responsive srcset/sizes attributes) and to support lazy loading to improve perceived performance.
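For instance, one small piece such a component might codify is generating the srcset string from a list of available widths. This helper is hypothetical, not NerdWallet's actual component, and assumes an image service that accepts a width query parameter:

```javascript
// Build a srcset attribute value from a base URL and candidate widths.
// Assumes the image service accepts a `?w=` query parameter (hypothetical).
function buildSrcSet(src, widths) {
  return widths.map((w) => `${src}?w=${w} ${w}w`).join(', ');
}

const srcset = buildSrcSet('/img/hero.png', [320, 640, 1280]);
console.log(srcset);
// '/img/hero.png?w=320 320w, /img/hero.png?w=640 640w, /img/hero.png?w=1280 1280w'
```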
Godspeed those who attempt the server-rendered, code-split apps
This is a real quote from react-router that was live on their site until very recently. Codesplitting a server-side-rendered app is not as battle-tested, and only recently has there been solid tooling for it.
We ended up leveraging the suite of react-universal-component packages to handle CSS- and JS-based codesplitting that works both server side and client side.
We've had some challenges: setting it up initially, codesplitting from within a nested package, and codesplitting and CSS modules not playing nicely together. However, this has allowed us to do route-based or component-based codesplitting, and to split out things like large visualization libraries into their own lazy-loaded bundles.
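The core mechanic under libraries like these is memoizing the promise from a dynamic import() so each split chunk is only fetched once, whichever render requests it first. A stripped-down sketch with a stand-in loader instead of a real import():

```javascript
// Memoize chunk loaders so each split bundle is only fetched once.
const chunkCache = new Map();

function loadChunk(name, loader) {
  if (!chunkCache.has(name)) {
    // Store the in-flight promise, not the resolved module, so
    // concurrent requests for the same chunk share one fetch.
    chunkCache.set(name, loader());
  }
  return chunkCache.get(name);
}

// Stand-in for `() => import('./HeavyChart')` (hypothetical module):
let fetches = 0;
const fakeLoader = () => {
  fetches += 1;
  return Promise.resolve({ default: 'HeavyChart' });
};

Promise.all([
  loadChunk('HeavyChart', fakeLoader),
  loadChunk('HeavyChart', fakeLoader),
]).then(([a, b]) => {
  console.log(fetches); // 1: second call reused the in-flight promise
  console.log(a.default === b.default); // true
});
```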
CDN in front of nerdwallet.com
This was probably the single biggest optimization that improved our page load performance site-wide.
Thanks to a large effort from our DevOps team, we put Amazon's CloudFront in front of all traffic to our entire website. This was a huge win: CloudFront operates many data centers, so our customers now open connections with the CloudFront location closest to them rather than going all the way to nerdwallet.com's origin, which is hosted in far fewer locations. Additionally, even after the connection has been established, since nerdwallet.com is hosted on Amazon's S3/ECS, we can leverage Amazon's internal routing rather than the public internet to get content to users as fast as possible. Overall, this helped drop our site-wide time to first byte by ~20%.
The second large change associated with this is that we were able to consolidate all of our assets onto our top-level domain. For example, instead of referencing assets on the CDN at cdn.nerdwallet.com, we now use www.nerdwallet.com/cdn. Over HTTP/2 this means faster downloads for assets in the critical render path, since we don't pay for a new DNS lookup, TCP connection, and SSL handshake; the browser can reuse the existing connection already opened to the top-level domain.
CSS / Fonts / images
For assets in the critical render path that impact SpeedIndex, we leverage <link rel="preload"> hints in the <head> so the browser can potentially paint as much of the page as possible without having to make many requests in serial. For example, preloading a critical font (note the crossorigin attribute, which font preloads require):
<link rel="preload" as="font" type="font/woff2" href="..." crossorigin />
<script src="..." defer></script>
We chose defer for our core JS bundles over async. defer guarantees execution order and guarantees the main thread won't be blocked until the DOMContentLoaded event. In practice we saw async scripts sometimes being evaluated much earlier than we'd like, ahead of image paint or similar.
This required us to not
Building a Culture of Performance
In order to maintain and improve our site performance, we needed to build a culture that values site performance and measures it on a regular basis.
This is probably the biggest change we’ve made to ensure we are on top of our performance. We’ve adopted SpeedCurve and made SpeedIndex one of the key metrics we monitor across all of our frontend teams. If you don’t use SpeedCurve, we highly recommend it.
Vlad Silin, an intern on the FEI team for the past quarter, built an awesome tool to provide performance-related feedback in the form of a GitHub comment when developers make PRs. Read more about this in Vlad's post.
We’ve held performance-related workshops, talks, captured performance data in our data warehouse to correlate page performance to business performance, and in general been advocates where possible to make performance a core part of frontend engineering at NerdWallet.
We still have a long list of things we want to do. Some things we plan to work on in the near future:
- Inline CSS under a minimum threshold via Google’s Nginx pagespeed module
- Improve our image processing pipeline: support requiring an image via require('../my-image.png') from within JS and generate responsive image sizes either at build time or on the fly
- Service Workers for better offline support
- Upgrade versions of some of our key packages (webpack 4, React 16)