Towards a Faster Dictionary: Part 1

Adam Traver
Dictionary Engineering
Nov 28, 2018 · 5 min read

Dictionary.com was founded in May of 1995. As web technologies have emerged over the last 20+ years (including JavaScript, which shipped in Netscape Navigator 2.0 in September of that same year!), we’ve evolved to take advantage of the benefits they offer. In recent months, we’ve made a number of improvements to address both maintainability and performance. Since we’ve already discussed the design of our new tech stack, I’d like to talk a bit about how we’ve tackled overall site performance.

Where to Begin

The question isn’t “what are we going to do,” the question is “what aren’t we going to do?” — Ferris Bueller’s Day Off

Because site performance is affected by so many systems, we started a small task force with members drawn from several teams. This gave us input from a range of specialties, from HTML response improvements to CDN optimization.

It was important to get everyone on the same page about how we would measure success, so we didn’t waste time improving things that stakeholders weren’t going to track. We looked at our page speed metrics in tools like PageSpeed Insights and Google Analytics and set a concrete goal: reduce our page load time as measured by both First Contentful Paint (FCP) and DOM Content Loaded (DCL).

With the metrics we wanted to track in hand, we still needed to know where we could realize the biggest gains. With so much written about web performance in recent years, it can be tough to figure out what will give you the biggest bang for your buck.

Let the Browser Do its Job

Web browsers are incredibly powerful at this point, but they’re still not able to fully predict what we’re trying to do without some help. One of the main things we strove for during this process was letting the browser do what it does best while giving it whatever hints it needs. With that in mind, here are a few of the optimizations we’ve delivered.

HTTP/2

Even though HTTP/2 is a relatively new revision of HTTP, vendor support is getting pretty good. We rely on a CDN to serve most of our traffic on top of our platform stack at Amazon. Fortunately, both of those services support HTTP/2 without us needing to enable it at the application level, while still delivering almost all of the benefit (the exception is Server Push, which Amazon doesn’t yet support).

HTML served using the “h2” (HTTP/2) protocol

With HTTP/2 (along with our recent migration to HTTPS), we’re not only signaling to web browsers that we’re a good web citizen; we also benefit from a couple of performance improvements HTTP/1.1 didn’t offer:

  • Multiplexing: Fully multiplexed requests and responses allow us to serve all our content from a single domain, as opposed to sharding it by domain. We’re not all the way there yet, but HTTP/2 allows us to work towards this.
  • Header compression: Request and response headers can get bloated — especially with large cookies — and any compression we can add to them helps reduce time over the wire.
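
If you want to verify what your CDN is actually negotiating, a quick spot check from the command line works; this is a generic sketch, assuming a reasonably recent curl build with HTTP/2 support:

    # Prints "2" when the server negotiates HTTP/2 for the request
    curl -sI --http2 -o /dev/null -w '%{http_version}\n' https://www.dictionary.com/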

Deferred JavaScript

We found quite a large performance boost from adding the venerable defer attribute to our <script> tags. We considered the newer async attribute instead, but two things made us opt for defer: async gives us no control over the order in which our scripts execute, and async scripts run as soon as they finish downloading, which can interrupt HTML parsing and block DOM rendering.

Utilizing the “defer” attribute
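
For illustration, here’s a minimal sketch of the change (bundle filenames are hypothetical):

    <!-- Before: the parser halts while each script downloads and executes -->
    <script src="/js/app.abc123.js"></script>

    <!-- After: scripts download in parallel with HTML parsing, then execute
         in document order once parsing finishes, just before DCL fires -->
    <script defer src="/js/app.abc123.js"></script>
    <script defer src="/js/vendor.def456.js"></script>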

While defer doesn’t move JavaScript execution past DCL (deferred scripts run right before the event fires), removing the previously synchronous script execution from our page made the rest of the page load considerably faster and allowed the browser to do its job more efficiently.

Caching Static Assets Effectively

As I mentioned, we use a CDN to cache our pages and assets. However, one thing we (and PageSpeed Insights) noticed early on was that while we were caching our assets at the edge, our CDN was clobbering our origins’ Cache-Control headers and substituting its own.

Caching hashed assets effectively indefinitely

We regenerate our JavaScript bundles on every deployment with uniquely hashed filenames, so those assets can (and should) be cached for an indefinite period of time: a new deployment simply creates entirely new files. Watching the CDN drop their effective cache lifetime to 2 hours was painful. Fortunately, issues like these are easily fixed through CDN configuration, and we were back to letting browsers do the work of keeping our assets on disk for subsequent visits.
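
For reference, the kind of Cache-Control response header we want our origin to send for hashed bundles (and the CDN to pass through untouched) looks something like this; the exact values are illustrative:

    Cache-Control: public, max-age=31536000, immutable

The immutable extension tells supporting browsers not to revalidate the asset at all during its lifetime, which is safe here precisely because a changed bundle always ships under a new hashed filename.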

Preconnecting to Known Domains

“The speed of light sucks.” — John Carmack

When we migrated our site to HTTPS earlier this year, we noticed an increase in connection times due to the additional TLS negotiation HTTPS requires. Add that to the DNS lookup and TCP handshake needed for every new server, and you could be staring at 150ms just to open a single connection.

Keeping with the theme of letting the browser do its job, we looked into Resource Hints, specifically rel="preconnect" on <link> tags. On any given page, we know we’re going to connect to about 6 different third-party domains at some point in the page’s lifecycle (both during HTML parsing and subsequent JavaScript execution). With that knowledge, we added preconnect hints to our page, letting the DNS lookup, TCP handshake, and TLS negotiation for these domains run asynchronously, before the browser actually requested the resources.
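
In markup, each hint is a single <link> tag. Here’s a sketch using Google Analytics (one of the third parties mentioned earlier) as an example, with dns-prefetch included as a common fallback for browsers that don’t support preconnect:

    <!-- Open the DNS + TCP + TLS connection early, before any resource is requested -->
    <link rel="preconnect" href="https://www.google-analytics.com">
    <!-- Fallback: at least resolve DNS in browsers without preconnect support -->
    <link rel="dns-prefetch" href="https://www.google-analytics.com">

One caveat worth knowing: resources fetched with CORS (web fonts, for example) use a separate connection pool, so they need their own preconnect with the crossorigin attribute.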

Next Steps

Performance, like security, is an ongoing concern. As we completed each of the above tasks (as well as a few others), we monitored our site’s performance for new hot spots: when we’d finish a task, performance would improve, and something else would rise to the top as the next target.

We’re working to keep performance in the conversation by investigating how to measure it in our CI/CD pipeline, as well as through continuous monitoring set up with proper thresholds and alerting.

We’ll be back soon to talk more about how our front-end React infrastructure is being optimized to support our performance goals.
