Demystifying SEO with experiments

Pinterest Engineering
Pinterest Engineering Blog
6 min read · Jan 27, 2015

Julie Ahn | Pinterest engineer, Growth

Search engine optimization (SEO) has been one of the biggest drivers of growth for Pinterest. However, it wasn’t always easy to find winning strategies at our scale. Traditional SEO amounts to trying out different known tactics and hoping for the best. You might have a good traffic day or a bad traffic day and not know what really triggered it, which often makes people think of SEO as magic rather than engineering.

Our SEO goal is to help billions of internet users discover Pinterest and find value in it as a visual bookmarking tool. Over time we’ve found the only way to verify whether a change positively affects user behavior is to run an A/B test. Unfortunately, we didn’t have similar tools to test search engine behavior, so we built an experiment framework and used it to turn “magic” into deterministic science.

Building an SEO experiment framework

We wanted an experimentation tool that allows us to accurately measure the overall effect any content modification has on SEO and, more importantly, on overall user growth.

The experiment tool consists of these three independent components:

  • Configuration to define experiments and group ranges (a sketch of such a configuration follows this list)
  • Daily data job to compute traffic to the pages in each experiment group (we measure traffic by the number of unique sessions referred by search engines to Pinterest pages)
  • Dashboard to view results
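
As a rough illustration of the first component, an experiment definition might look like the sketch below. The field names, experiment names and the idea of 100 hash buckets are assumptions for illustration, not Pinterest’s actual configuration schema; the hashing itself is explained in the next section.

# Hypothetical experiment configuration (field names and bucket counts are assumed).
# Each experiment claims bucket ranges out of 100 hash buckets; pages whose
# URLs hash into a range receive that group's treatment.
SEO_EXPERIMENTS = {
    "board_title_pin_count": {
        "enabled_group_range": range(0, 50),    # 50 percent of pages get the change
        "control_group_range": range(50, 100),  # 50 percent keep the current page
    },
    "richer_pin_descriptions": {
        "enabled_group_range": range(0, 10),    # smaller groups for a more cautious test
        "control_group_range": range(10, 20),
    },
}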

Unlike A/B experiments that are typically done by segmenting users, our SEO experiment framework segments pages. For example, in an experiment that has 50 percent of its pages grouped in “enabled” and the other 50 percent grouped in “control,” a page will fall into one of the groups depending on its URL:

‘enabled’ if hash(experiment_name + page_url) in enabled_group_range

‘control’ if hash(experiment_name + page_url) in control_group_range

Hashing the page URL combined with the experiment name ensures even distribution of pages among experiment groups and enables us to simultaneously run multiple experiments with varying group sizes.
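
In code, the bucketing might look roughly like the following sketch. This is a minimal illustration assuming MD5 hashing and 100 buckets; the post doesn’t specify the hash function or bucket count.

import hashlib

NUM_BUCKETS = 100  # assumed granularity; the post doesn't specify a bucket count

def bucket_for(experiment_name, page_url):
    # Deterministically map a page to a bucket for a given experiment
    digest = hashlib.md5((experiment_name + page_url).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_BUCKETS

def group_for(experiment_name, page_url, enabled_range, control_range):
    # Return the experiment group a page falls into, if any
    bucket = bucket_for(experiment_name, page_url)
    if bucket in enabled_range:
        return "enabled"
    if bucket in control_range:
        return "control"
    return "not_in_experiment"

# Example: a 50/50 split over 100 buckets
print(group_for("board_title_pin_count",
                "https://www.pinterest.com/someuser/someboard/",
                enabled_range=range(0, 50),
                control_range=range(50, 100)))

Because the experiment name is part of the hash input, the same page lands in different buckets for different experiments, which keeps the groups of simultaneously running experiments uncorrelated.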

Once the experiment is up and running, we can measure the performance of each group by comparing traffic.

As Figure 2 illustrates, the traffic between the two groups wasn’t the same even before the launch of the experiment, with the enabled group getting slightly more traffic than the control group. That’s because some pages are more popular than others, so no matter how we distribute the pages, differences in traffic between groups may remain. To normalize this discrepancy and the weekly fluctuations, we re-plot the data as the difference between the two groups, as shown in Figure 3. Now it’s easier to see that traffic to the enabled group improved after the launch.

To measure the effect of the experiments, we can compare the averages of the difference before and after the experiment launch. The distance between the two dotted lines on Figure 3 is the average gain or loss of this experiment.
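
Concretely, the computation amounts to something like the sketch below. The daily session counts are hypothetical numbers for illustration, and the post doesn’t describe any particular statistical test beyond comparing the averages.

# Hypothetical daily unique-session counts per group (illustrative numbers only)
enabled_daily = [10500, 10480, 10620, 10550, 11200, 11350, 11400, 11380]
control_daily = [10300, 10290, 10400, 10380, 10350, 10420, 10390, 10410]
launch_day = 4  # index of the first day after the experiment launched

# The daily difference between groups cancels out the pre-existing popularity gap
# and shared weekly fluctuations (the series plotted in Figure 3)
diff = [e - c for e, c in zip(enabled_daily, control_daily)]

pre_launch_avg = sum(diff[:launch_day]) / launch_day
post_launch_avg = sum(diff[launch_day:]) / (len(diff) - launch_day)

# The gap between the two averages (the two dotted lines in Figure 3)
# is the estimated daily gain or loss from the experiment
estimated_effect = post_launch_avg - pre_launch_avg
print(f"estimated daily gain: {estimated_effect:.0f} sessions")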

What we learned

There are hundreds of different ways to do SEO, including sitemaps, link-building, search-engine-friendly site design and so on. The best strategy for successful SEO can differ by product, by page and even by season. Identifying what works best for each case helps us move fast with limited resources. By running a large number of experiments, we found some well-known strategies for SEO didn’t work for us, while certain tactics we weren’t confident about worked like a charm.

For example, we once noticed that Google Webmaster Tools detected too many duplicate title tags on our board pages. The title tags on board pages were set to “{board_name} on Pinterest,” and there are many boards created by different users with the same names. The page title tag is known to be an important factor for SEO, so we wondered if keeping the title tags unique would increase traffic. We ran an experiment to reduce duplicate title tags by including Pin counts in the title tag, for instance “{board_name} on Pinterest | ({number} Pins).” But we found no statistically significant change in traffic between the groups.
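
For that experiment, the per-group title logic would have looked something like this sketch; the function and parameter names are hypothetical.

def board_page_title(board_name, pin_count, group):
    # Render the board page title tag according to the experiment group
    if group == "enabled":
        # Appending the Pin count makes most board titles unique
        return f"{board_name} on Pinterest | ({pin_count} Pins)"
    # Control keeps the original, frequently duplicated title
    return f"{board_name} on Pinterest"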

So, what worked well on Pinterest? Pinterest is filled with tens of billions of Pins, but their text descriptions can be lacking. We assumed that providing better text on our pages would address this, so we started with a relatively simple approach to improving the descriptions. Pins are added by many different users, and some are more descriptive than others. For many Pins, we picked a better description from other Pins that contained the same image and showed it in addition to the existing description. The experiment results were much better than we expected (remember Figure 3?), which motivated us to invest more in text descriptions using more sophisticated technologies, such as visual analysis. A series of follow-up experiments resulted in a nearly 30 percent increase in traffic last year.
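
A minimal sketch of that first approach might look like the following, assuming Pins are grouped by an image signature and that the longest non-empty description is used as a proxy for the “best” one; the actual selection criterion isn’t described in the post.

from collections import defaultdict

def best_description_per_image(pins):
    # pins: iterable of (image_signature, description) pairs
    by_image = defaultdict(list)
    for image_sig, description in pins:
        if description and description.strip():
            by_image[image_sig].append(description.strip())
    # Assumed heuristic: treat the longest description as the most descriptive
    return {sig: max(descs, key=len) for sig, descs in by_image.items()}

# The chosen description is shown alongside a Pin's own description
# on pages whose URLs hash into the enabled group.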

Making data-driven decisions

In most cases we found that the impact of an experiment on traffic starts to show as early as a couple of days after launch. The difference between groups continues to grow for a week or two until it becomes steady. Knowing this not only helps us ship successful experiments sooner, but also helps us turn off failing experiments as early as possible. We learned this the hard way.

Figure 4 shows the result of an experiment we ran to test rendering with JavaScript for better web performance. The purpose of this experiment was to make sure the change didn’t negatively impact SEO. We noticed a slight drop in traffic on day two but decided to wait a few more days because we thought it might take more time for the crawlers to adjust. Unfortunately, there was no recovery in traffic, and we ended up turning off this experiment a couple of days too late. Even after we turned off the experiment, it took almost a month for pages in the enabled group to recover from the traffic drop. This was unfortunate, but now we know how and when to make decisions for similar experiments in the future. Additionally, we learned the importance of running experiments for non-SEO projects to prove they aren’t negatively affecting SEO.

The experiment above may have failed, but running it was a huge success in that it prevented a severe traffic drop that would have had no known cause. We generally run SEO experiments for all major changes that impact unauth pages. In particular, major layout changes, modifications to heading tags, new signup modals and rendering content via JavaScript must be tested with the SEO experiment framework.

What’s next?

We’re still in the early stage of expanding the scope of SEO experiments. We plan to improve the existing framework to allow more segmentation and filtering options to meet the needs of different experiments.

If you’re interested in solving SEO challenges, join our team!

Julie Ahn is a software engineer on the Growth team.

For Pinterest engineering news and updates, follow our engineering Pinterest, Facebook and Twitter. Interested in joining the team? Check out our Careers site.
