Driving user growth with performance improvements
Sam Meder, Vadim Antonov & Jeff Chang | Pinterest engineers, Growth
In early 2015, Pinterest engineers ran an experiment that improved mobile web home landing page performance by 60 percent and mobile signup conversion rate by 40 percent. However, the experiment was a hacky solution that took a lot of shortcuts, such as serving pre-rendered HTML pages without using any internal template rendering engines or common resources (JS, CSS). To productionize the learnings from this experiment, the entire front-end engine, all page templates and common elements had to be rewritten. It was a huge effort, and to achieve it we needed to start by building robust metrics to track our progress across all parts of the serving system. In this post, we’ll cover how we improved performance on Pinterest pages, and how it led to the biggest increase in user acquisition of 2016.
As a first step we needed to clearly define and implement the metrics we wanted to improve. The original metric used for the 2015 experiment was overall page load time (PLT), defined as the time between the user typing in or clicking on a URL and the whole page being rendered. In terms of the browser Navigation Timing API, it’s the difference between the navigationStart and domComplete events:
- navigationStart is recorded when a user clicks a link or hits enter after typing a URL in the browser navigation bar.
- domComplete is recorded when the document and its subresources have finished loading, just before the load event fires.
This metric is a good start but has a major drawback: it doesn’t reflect the performance the user actually perceives, since the visible part of the page can load significantly faster than the entire page.
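As a rough sketch, overall PLT can be derived from two Navigation Timing timestamps. The `TimingSample` shape and the `pageLoadTimeMs` helper are illustrative names, not Pinterest's code; in a browser the values would come from `performance.timing`.

```typescript
// Subset of the Navigation Timing fields used here (epoch milliseconds).
interface TimingSample {
  navigationStart: number;
  domComplete: number;
}

// Overall page load time: navigationStart -> domComplete.
// Returns null when the page has not finished loading yet
// (the browser reports domComplete as 0 until then).
function pageLoadTimeMs(t: TimingSample): number | null {
  if (t.domComplete === 0 || t.domComplete < t.navigationStart) {
    return null;
  }
  return t.domComplete - t.navigationStart;
}

// Made-up timestamps for illustration.
const timingSample: TimingSample = { navigationStart: 1000, domComplete: 4200 };
console.log(pageLoadTimeMs(timingSample)); // 3200
```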
User perceived wait time
To solve this issue we introduced another metric, user perceived wait time (UPWT), defined as the time between the user typing in or clicking on a URL and the part of the page visible to the user being rendered. This is a custom metric based on image loading events: we track which images are on the screen and when they’ve finished loading. UPWT starts at the navigationStart event and ends somewhere between the domLoading and domComplete events:
- domLoading is recorded when the browser starts parsing the document.
As an additional benefit, a similar metric could be introduced to mobile applications and measured in the same way.
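One way such a metric could be computed is sketched below. This is an assumption-laden illustration rather than the actual implementation: it treats an image as "on screen" when its top edge falls inside the initial viewport, and the `ImageLoad` shape and `userPerceivedWaitMs` helper are hypothetical names.

```typescript
// One observed image: where it sits on the page and when it finished loading.
interface ImageLoad {
  top: number; // distance from the top of the page, in CSS pixels
  loadedAt: number; // epoch ms when the image's load event fired
}

// UPWT sketch: time from navigationStart until the last image that starts
// inside the initial viewport has finished loading.
function userPerceivedWaitMs(
  navigationStart: number,
  viewportHeight: number,
  images: ImageLoad[],
): number | null {
  const visible = images.filter((img) => img.top < viewportHeight);
  if (visible.length === 0) {
    return null; // nothing on screen to wait for; the caller picks a fallback
  }
  const lastLoaded = Math.max(...visible.map((img) => img.loadedAt));
  return lastLoaded - navigationStart;
}

// Made-up observations for illustration.
const observedImages: ImageLoad[] = [
  { top: 0, loadedAt: 1800 },
  { top: 400, loadedAt: 2100 },
  { top: 2500, loadedAt: 3900 }, // below the fold, ignored
];
console.log(userPerceivedWaitMs(1000, 800, observedImages)); // 1100
```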
We’ve integrated both metrics (overall PLT and UPWT) as well as some extra performance metrics (e.g. server side performance and more detailed browser side performance) into key company dashboards and our experimentation framework. This enables us to track progress and quickly understand which improvements lead to bigger gains.
Learning: Create the right metric to track your progress and focus on big impact changes first.
Optimization opportunities can be split into three major categories: front-end, network and back-end.
Page weight (CSS/JS/Images/HTML)
By looking at synthetic test results we quickly realized that our pages required a lot of bandwidth to load. This is especially problematic in some international markets with older network infrastructures. To solve for this, we started down the path of becoming much more granular about what we load. In the past, we’d often fetch the CSS and JS for the entire site. Now, we’re only fetching the CSS and JS needed for rendering content above the fold, followed by lazy loading other assets after the initial render is done. We also took another look at which images were requested, with an eye on whether they’re needed at all and whether we’re fetching the optimal size. These two optimizations together yielded a 60 percent reduction in the number of bytes needed to display a page.
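To illustrate the "optimal size" idea, the sketch below picks the smallest image rendition that still covers the rendered size on a given screen. The helper name and the rendition widths are hypothetical and are not taken from Pinterest's setup.

```typescript
// Pick the smallest available rendition that still covers the rendered size.
function pickImageWidth(
  availableWidths: number[],
  cssWidth: number,
  devicePixelRatio: number,
): number {
  if (availableWidths.length === 0) {
    throw new Error("no renditions available");
  }
  // Physical pixels needed to render the image sharply on this screen.
  const needed = Math.ceil(cssWidth * devicePixelRatio);
  const sorted = availableWidths.slice().sort((a, b) => a - b);
  for (const width of sorted) {
    if (width >= needed) {
      return width;
    }
  }
  // Nothing is large enough: fall back to the largest rendition.
  return Math.max(...sorted);
}

console.log(pickImageWidth([200, 400, 800, 1200], 180, 2)); // 400
console.log(pickImageWidth([200, 400, 800, 1200], 180, 1)); // 200
```

Requesting the 400-pixel rendition instead of the 1200-pixel one for a 180-pixel slot is the kind of saving that adds up across a page full of images.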
Our focus on performance coincided with a site-wide migration to React from a homegrown framework. Our team was one of the early adopters of React at Pinterest, and we realized substantial performance gains from its rendering model as we went from an uncontrolled model, where anything could modify the DOM, to React’s virtual DOM and batched updates.
Early flush/chunked transfer encoding
We optimized the path between client and server by reviewing how pages were being rendered server-side and eliminating unnecessary buffering, so that browsers receive the <head> section of the page early and can start fetching framework-level JS and CSS resources in parallel with data fetches and server-side rendering. We were already making use of chunked transfer encoding to send pieces of the page as they finished rendering, but a review of the infrastructure between the service that renders the page and the end user turned up a couple of steps where we were buffering responses rather than streaming them. Removing that buffering got bytes to the browser sooner and improved page load times.
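The early-flush idea can be sketched as follows. `ChunkSink` is a hypothetical stand-in for a streaming HTTP response, and `streamPage`, the asset URLs and the markup are illustrative assumptions rather than the actual rendering code.

```typescript
// Minimal stand-in for a streaming HTTP response (hypothetical interface).
interface ChunkSink {
  write(chunk: string): void;
  end(): void;
}

// Write <head> before waiting on data, so the browser can start fetching
// JS and CSS while the server is still fetching data and rendering the body.
async function streamPage(
  sink: ChunkSink,
  fetchData: () => Promise<string[]>,
): Promise<void> {
  sink.write(
    "<!doctype html><html><head>" +
      '<link rel="stylesheet" href="/static/app.css">' +
      '<script src="/static/app.js" defer></script>' +
      "</head><body>",
  );
  const items = await fetchData(); // the slow part: API calls
  for (const item of items) {
    // One chunk per rendered piece; items are assumed to be HTML-escaped.
    sink.write(`<div class="item">${item}</div>`);
  }
  sink.write("</body></html>");
  sink.end();
}
```

Flushing the head early only helps if nothing between the renderer and the user collects the chunks into a single response, which is why the buffering review mattered.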
We made significant improvements to our transport infrastructure. We introduced multiple layers of caching in our CDN setup, enabled IPv6, switched to higher CDN service tiers and introduced SSL edge termination (DSA) globally.
Parallelize as much as possible
Rendering a page typically requires multiple different pieces of data from different sources. For us, this currently translates to multiple API calls. There’s a natural data dependency graph between these calls that dictates which calls can be made in parallel and which need to be sequential due to these data dependencies. We’re working towards adopting GraphQL which would automate the optimal parallelization of data fetches. In the meantime, we’re reviewing our current call graph to ensure we parallelize calls that are unnecessarily sequential.
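A review like this can be approximated by grouping calls into levels of the dependency graph, as in the sketch below. The `CallGraph` type, the `parallelLevels` helper and the call names are hypothetical.

```typescript
// Each call lists the calls whose results it needs first.
type CallGraph = Record<string, string[]>;

// Group calls into levels: every call in a level depends only on earlier
// levels, so all calls within one level can be issued in parallel.
function parallelLevels(graph: CallGraph): string[][] {
  const done = new Set<string>();
  const levels: string[][] = [];
  let remaining = Object.keys(graph);
  while (remaining.length > 0) {
    const ready = remaining.filter((name) =>
      graph[name].every((dep) => done.has(dep)),
    );
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency in call graph");
    }
    ready.forEach((name) => done.add(name));
    remaining = remaining.filter((name) => !done.has(name));
    levels.push(ready);
  }
  return levels;
}

// Hypothetical calls needed to render one page.
const pageCalls: CallGraph = {
  user: [],
  board: [],
  pins: ["board"],
  recommendations: ["user", "pins"],
};
console.log(JSON.stringify(parallelLevels(pageCalls)));
// [["user","board"],["pins"],["recommendations"]]
```

Each level could then be awaited with a single `Promise.all`, so a page needs only as many sequential round trips as there are levels.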
Only return what’s needed
We tailored the data we request to exactly what’s needed by the UI. This both saves on network overhead and eliminates unnecessary fetches on the server side, since additional fields often require additional calls to backend services.
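The sketch below shows why trimming fields can save backend calls: each field has its own resolver, and only the requested ones run. The `Resolver` type, the `resolveRequested` helper and the field names are hypothetical.

```typescript
// A resolver produces the value of one field; some resolvers are cheap,
// others would call another backend service.
type Resolver = () => unknown;

// Run only the resolvers for the fields the UI actually asked for.
function resolveRequested(
  resolvers: Record<string, Resolver>,
  fields: string[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const field of fields) {
    const resolve = resolvers[field];
    if (resolve === undefined) {
      throw new Error(`unknown field: ${field}`);
    }
    out[field] = resolve();
  }
  return out;
}

let backendCalls = 0;
const pinResolvers: Record<string, Resolver> = {
  id: () => "pin-1",
  title: () => "Example pin",
  commentCount: () => {
    backendCalls += 1; // stands in for a call to another service
    return 12;
  },
};

const pinForGrid = resolveRequested(pinResolvers, ["id", "title"]);
console.log(JSON.stringify(pinForGrid), backendCalls);
// {"id":"pin-1","title":"Example pin"} 0
```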
Cache where possible
We spent some time expanding our “edge” caching of data for page-types with low cardinality (i.e. where the number of pages is on the order of hundreds of thousands rather than billions). Caching is an area we’re going to explore further, from caching just “head” page data for the case where we have too many pages of a given type to effectively cache otherwise, to triggering cache refreshes in the background.
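One possible shape for the background-refresh idea is sketched below. It is purely illustrative and is not the edge cache described above. Stale entries are still served immediately and queued for a later refresh; `createPageCache`, the loader and the URL are hypothetical, and the clock is passed in to keep the example deterministic.

```typescript
interface CacheEntry {
  value: string;
  storedAt: number; // epoch ms
}

// Tiny page-data cache. Fresh entries are served directly, stale entries
// are served but queued for a background refresh, and misses load inline.
function createPageCache(ttlMs: number, load: (key: string) => string) {
  const entries = new Map<string, CacheEntry>();
  const refreshQueue: string[] = [];

  function get(key: string, now: number): string {
    const entry = entries.get(key);
    if (entry === undefined) {
      const value = load(key);
      entries.set(key, { value, storedAt: now });
      return value;
    }
    if (now - entry.storedAt > ttlMs && refreshQueue.indexOf(key) === -1) {
      refreshQueue.push(key); // serve stale now, refresh later
    }
    return entry.value;
  }

  // Meant to be called by a background job: reload every queued key.
  function refreshStale(now: number): void {
    while (refreshQueue.length > 0) {
      const key = refreshQueue.shift() as string;
      entries.set(key, { value: load(key), storedAt: now });
    }
  }

  return { get, refreshStale };
}

let loadCount = 0;
const pageCache = createPageCache(60000, (key) => {
  loadCount += 1;
  return `${key}@v${loadCount}`;
});
console.log(pageCache.get("/topic/recipes", 0)); // /topic/recipes@v1 (miss)
console.log(pageCache.get("/topic/recipes", 90000)); // /topic/recipes@v1 (stale, still served)
pageCache.refreshStale(90000);
console.log(pageCache.get("/topic/recipes", 91000)); // /topic/recipes@v2
```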
Maximizing growth gains with performance improvements
When rewriting web pages for performance, it’s important not to try out a new design at the same time. If a faster page with a different design is compared to the original page, it’s impossible to know whether conversion changes are due to performance improvements or design changes. Build the exact same page. Also, in order to fully understand the impact of performance on a web app, the experiment should be set up so that metrics can be segmented by page type as well as by desktop web vs. mobile web. Different pages see different conversion and traffic gains from performance increases. For us, aggregating all the pages showed that overall conversions were slightly up, but looking into the segments showed that desktop web conversions were up a lot while mobile web conversions were actually down, lowering the average. We investigated why mobile web conversion metrics were down and discovered a few issues in feature parity.
In order to maximize the overall page improvement, it’s important to reimplement even small conversion features. Our original pages had a lot of these features, and as we continued to find discrepancies and fix them, our conversion rate continued to go up. The big learning here is to segment pages by page type and by web/mobile web to better understand where gains come from and to detect issues with any particular segment. These issues might be masked when looking at the overall aggregated conversion rate change.
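The masking effect is easy to reproduce with made-up numbers. In the sketch below, every name and figure is hypothetical and none of it is Pinterest data: the aggregate lift looks mildly positive while one segment is clearly down.

```typescript
interface Counts {
  controlVisits: number;
  controlSignups: number;
  treatmentVisits: number;
  treatmentSignups: number;
}

interface Segment extends Counts {
  pageType: string;
  platform: string;
}

// Relative change in conversion rate, treatment vs. control (0.2 means +20%).
function conversionLift(c: Counts): number {
  const controlRate = c.controlSignups / c.controlVisits;
  const treatmentRate = c.treatmentSignups / c.treatmentVisits;
  return treatmentRate / controlRate - 1;
}

// Sum all rows into one to get the aggregate view.
function sumCounts(rows: Counts[]): Counts {
  const total: Counts = {
    controlVisits: 0,
    controlSignups: 0,
    treatmentVisits: 0,
    treatmentSignups: 0,
  };
  for (const row of rows) {
    total.controlVisits += row.controlVisits;
    total.controlSignups += row.controlSignups;
    total.treatmentVisits += row.treatmentVisits;
    total.treatmentSignups += row.treatmentSignups;
  }
  return total;
}

const experimentSegments: Segment[] = [
  { pageType: "pin", platform: "desktop", controlVisits: 10000, controlSignups: 500, treatmentVisits: 10000, treatmentSignups: 600 },
  { pageType: "pin", platform: "mobile", controlVisits: 10000, controlSignups: 400, treatmentVisits: 10000, treatmentSignups: 360 },
];

for (const s of experimentSegments) {
  console.log(`${s.pageType}/${s.platform}`, (conversionLift(s) * 100).toFixed(1) + "%");
}
console.log("all", (conversionLift(sumCounts(experimentSegments)) * 100).toFixed(1) + "%");
// pin/desktop 20.0%
// pin/mobile -10.0%
// all 6.7%
```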
Conversions feature checklist
- Identical upsell mechanics
- Navigation mechanics (Popup? New tab?)
- Signup and form mechanics (Field validation messages, same fields and steps)
- Auto-authentication features
- Mobile web and tablet app upsells
- Mobile web deeplinking
Another important step in a performance rewrite is to run an SEO experiment on every page type. For more information about the basics of SEO experiments, check out our previous post, Demystifying SEO with experiments. SEO experiments show whether page load time improvements actually result in more traffic from search engines, and in our case they did. If your page is highly trafficked, chances are you have also implemented a number of features that improve search engine ranking. An SEO experiment will also show whether some of those features weren’t properly reimplemented. Even small details like image sizes or the HTML tags used can matter, so it’s important to monitor this for all page types. For us, it took a few weeks of identifying and fixing discrepancies to get our SEO traffic on par.
SEO Feature Checklist
- Major tags (e.g. <h1>, hreflang, rel=canonical)
- Identical image sizes
- Descriptive text
- Amount of content on first page load
Learnings:
- Build a nearly identical page; do not redesign it
- Segment the experiment by page type and by web vs. mobile web
- Also run an SEO experiment
- Look into each segment to see whether any missing features may be decreasing the conversion rate or SEO traffic
Results & future
Rebuilding our pages for performance led to a 40 percent decrease in Pinner wait time, a 15 percent increase in SEO traffic and a 15 percent increase in conversion rate to signup. Because the traffic and conversion rate increases are multiplicative, this was a huge win for us in terms of web and app signups. In 2016, this was our team’s biggest user acquisition win. In addition, Pinners with slow internet connections got a significantly better experience. Because of this project, the team now focuses on performance as one of the biggest opportunities for new user growth.
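As a back-of-the-envelope illustration of the multiplicative effect, assume both 15 percent gains apply to the same funnel, so that the extra search visitors convert at the improved rate. That assumption is ours; the figures above do not state it.

```latex
\frac{\text{signups}_{\text{after}}}{\text{signups}_{\text{before}}}
  = \underbrace{(1 + 0.15)}_{\text{traffic}} \times \underbrace{(1 + 0.15)}_{\text{conversion rate}}
  = 1.3225
```

Under that assumption, signups from search traffic would grow by roughly 32 percent, rather than the 30 percent a simple sum would suggest.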
Other Pinterest Growth Engineering Blog Posts: