How We Used PPC Ad Data to Design Successful Organic Search Snippet Experiments

Hamlet Batista, CEO of RankSense, notes that many organic search snippets are straightforward descriptions of the page content. Can organic CTRs be improved by taking a page out of your AdWords playbook? His research says yes.

In a recent study published on Search Engine Land, Brian Wood from the homewares e-commerce platform Wayfair.com argues that paid search ads don’t help inform title experiments, basing much of his argument on the claim that different types of users click on paid search ads than on organic listings.

I was surprised by these conclusions, as they are contrary to findings in my own company’s research. My team has been conducting similar experiments, and using paid search ads to guide our search snippets with considerable success.

Here is one example of a successful meta description experiment from one of our B2B e-commerce clients.

Let me explain some important differences between our methodologies, which should hopefully clear up this contentious issue:

Title vs meta description experiments

Title tag changes don’t just affect click-through rates; they also cause ranking shifts. If we only want to learn which message will increase click-through rate while rankings stay the same, we need to use meta descriptions, because they won’t cause re-rankings.

It is possible the Wayfair title experiments caused pages to lose important rankings. Any page can rank for hundreds of keywords, not just a few. Losing a good number of them would reduce the SEO traffic a page receives.

It’s important to note that, like Wayfair, our study didn’t test individual titles or meta descriptions. Instead, we used title or meta description templates, which allowed us to test groups of pages as one. For example:

“Order <product name> and get free shipping”, not “Order blue widget and get free shipping”

We used this approach because traffic to a single page can vary widely, but when you measure traffic to a group of similar pages, the traffic behaves more predictably.
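The template approach can be sketched in a few lines. The template text and product names below are illustrative, not the client’s actual copy; the point is that one template generates the meta description for every page in the group, so the group is measured as a single test unit:

```python
# Sketch: applying one meta description template to a group of similar pages,
# so the whole group can be measured as a single test unit.
# The template and product names are illustrative examples.

TEMPLATE = "Order {product_name} and get free shipping"

products = ["blue widget", "red widget", "green widget"]

# One template, many pages: every page in the group gets the same message.
meta_descriptions = {p: TEMPLATE.format(product_name=p) for p in products}

print(meta_descriptions["blue widget"])
```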

Artificial users vs real users

A key difference between our methodologies is the Wayfair team’s assumption that artificial and incentivized Mechanical Turk users’ behavior would mirror real potential customers.

Most direct response marketers know that ads are designed to be compelling only to their specific target audience. For example, a business that sells luxury goods rarely uses discounts in its messaging because discounts would not resonate with its target audience; it would highlight exclusivity and a first-class experience instead. As such, Mechanical Turk users may not respond to Wayfair’s ads in the same way as its usual target customers, who are interested in buying furniture.

Instead of using three approaches to test our messaging, we used only one: the SEO experiment. We didn’t see the need to create new PPC experiments; instead, we looked at the AdWords ads of a B2B e-commerce client. We pulled the existing ads with the best CTRs that had at least ten clicks and transactions, used those ads to find ideas that seemed to resonate with potential buyers, and selected two to test as templates.

Importance of the buyer’s journey

Different kinds of pages appeal to different types of potential customers depending on which stage of the buyer’s journey they’re in.

People who are in the awareness stage tend to know the need or problem that they have, but not the best solution. Their searches tend to lead them to informational pages such as blog posts, infographics, and FAQ pages.

In the consideration stage, potential customers are evaluating different solutions, so they may be deciding what type of product they need or which business to use. For example, take someone who has recently become motivated to adopt a healthier lifestyle. At first, they don’t know exactly what changes they want to make, and will probably search for fairly generic terms like “exercise routines” or “healthy diet.”

As they move into the decision stage, potential customers begin to account for a variety of new factors, such as price, quality, and guarantees, that are typically addressed in promotional advertising. Continuing the previous example: once the new health enthusiast has decided they want to try yoga, they begin searching for specific yoga gear, DVDs, or classes.

When a potential customer lands on product pages from organic searches, they are likely near the end of the buyer’s journey. Promotional meta descriptions offering a good deal, a guarantee, or an assurance of quality are more likely to resonate with customers in this stage.

We don’t subscribe to the notion that there is a group of potential buyers that clicks on ads, and another that clicks on organic snippets. We believe there are potential customers that respond to benefits that resonate with them, whether that is in paid search snippets or organic search snippets. Put simply, if the message speaks to the potential customers’ needs and wishes, they respond by clicking. Our results support the notion that more promotional language will appeal to decision-stage customers.

Statistical analysis

Conducting accurate experiments to validate winning combinations is more challenging on e-commerce sites, where there are many factors at play, including seasonal fluctuations in sales. Fortunately, Etsy recently shared a very robust scientific framework in this post, which offers insights into this issue.

In short, the Etsy methodology is as follows:

  1. Pull all pages from Google Analytics to get the organic search traffic
  2. Normalize the list to include only canonical pages
  3. Exclude pages tagged as noindex and short-lived pages
  4. Select pages of the same type (i.e. only products)
  5. Order the pages by search traffic and assign them to ntile groups
  6. Use stratified sampling to randomly assign each page to a test group
  7. Run t-tests and visualize the data in covariance plots to ensure that the differences between the groups are not statistically significant
  8. Assign a treatment to each test group
  9. Once the changes are in place, pull SEO traffic data from Google Analytics for all the groups, and use Causal Impact to figure out the winning treatment (more about this below)
  10. Use A-A testing to ensure any differences between the control groups are not statistically significant
  11. Roll out the winning treatment to the other samples to ensure an increase in search traffic and a decrease in variance between groups
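Steps 5 and 6 above (ntile grouping plus stratified random assignment) can be sketched as follows. The page URLs and traffic numbers are made up for illustration; real data would come from Google Analytics after the filtering in steps 1 through 4:

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical pages: (url, monthly organic sessions). Real data would come
# from Google Analytics after keeping only canonical, indexable product pages.
pages = [(f"/product/{i}", random.randint(50, 5000)) for i in range(300)]

# Step 5: order pages by traffic and split them into ntile strata (deciles here).
pages.sort(key=lambda p: p[1], reverse=True)
n_tiles = 10
stratum_size = len(pages) // n_tiles
strata = [pages[i * stratum_size:(i + 1) * stratum_size] for i in range(n_tiles)]

# Step 6: within each stratum, randomly spread pages across the test groups,
# so every group receives a similar mix of high- and low-traffic pages.
n_groups = 5
groups = [[] for _ in range(n_groups)]
for stratum in strata:
    shuffled = random.sample(stratum, len(stratum))
    for i, page in enumerate(shuffled):
        groups[i % n_groups].append(page)

# Step 7 (simplified): sanity-check that mean traffic per group is close
# before assigning any treatments.
means = [statistics.mean(traffic for _, traffic in g) for g in groups]
print([round(m) for m in means])
```

The stratification is what makes the groups statistically comparable: a plain random split could by chance put most high-traffic pages into one group.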

In the case of the client from our first example, we created five statistically similar sample groups covering around 30% of the product pages, and another five statistically similar groups covering around 30% of the category pages.

Experiments of this type have been reported for product pages but not category pages. We realize that categories tend to define a broader subject and may not provide results as interesting as products. We applied the Etsy methodology to the 5 product groups and the 5 category groups.

This is a covariance plot of the product sample groups (step 7 above). They’re statistically similar enough to use for SEO tests.

As recommended by Bill Ulammandakh in the Etsy post, we used more than one control group.

The recommendation was to use two control groups; however, we decided to use three instead.

The remaining two groups were used to test the meta description ideas. One advantage we see in this approach over a traditional A/B test on half the population is that when an experiment performs worse, the negative impact is minimized. By adding an additional control group, we minimize this negative impact further. For example, it is possible that the two test groups will perform much worse than the three control groups, which could cause a site to lose a lot of search traffic during the testing period. Multiple control groups also help ensure that your results are due to treatment effects and not fluctuations in traffic to your control groups.

Accelerating the results

In about three weeks, we were able to assess which of our experiments had been the most successful.

In the case of the product pages, both promotional messages performed considerably better than the meta description copy carefully written by the copywriting team. Our changes outperformed all control groups. As the plot above shows, one of the promotional messages produced a cumulative increase in traffic of 51.7% over the control.

Adding promotional messaging to the categories didn’t provide measurable lift. As we explained above, category page visitors are in the consideration stage, and messaging about the quality of the products or a good selection may resonate better. We aim to run category page experiments with those ideas next. It is important to consider that e-commerce sites have fewer category pages than product pages, so it’s difficult to generate a large sample size.

In order to get results fast, we used Fetch as Google and manually submitted all the pages to the index. It helped that we could reach statistical significance with only a few hundred pages using the framework we learned from Etsy.

One of the motivations behind the Wayfair study was to find a faster method to perform SEO experiments. Their SEO experiments took 60 days to get results, so our project only taking three weeks was a substantial improvement.

Measuring Success

Figuring out whether an SEO experiment is successful is challenging because any factor outside our control can affect the search traffic. As recommended in the Etsy blog, we use Google’s Causal Impact package. You can learn more about it and its effectiveness here.

In a nutshell, it is a sophisticated statistical approach that allows users to compare the actual traffic (orange line in our plot) against a “virtual control” (dotted blue line), which is forecasted based on the behavior of the control groups. In our case, we created our control and test groups from similar data. Google offers a very easy-to-use R package if you are familiar with the language.
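Causal Impact itself fits a Bayesian structural time-series model, but the core “virtual control” idea can be illustrated with a much simpler ordinary least-squares sketch: fit the test group’s traffic against a control group during the pre-period, then forecast what the test group would have done in the post-period. All traffic numbers below are made up:

```python
# Simplified sketch of the "virtual control" idea behind Causal Impact.
# This ordinary least-squares version only illustrates the concept;
# the real package uses a Bayesian structural time-series model.

def fit_line(x, y):
    """Ordinary least squares: y ~ intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    return my - slope * mx, slope

# Weekly sessions before the meta description change (pre-period).
control_pre = [100, 110, 105, 120, 115]
test_pre    = [200, 222, 208, 240, 232]

# Weekly sessions after the change (post-period).
control_post = [118, 112, 125]
test_post    = [260, 250, 280]

intercept, slope = fit_line(control_pre, test_pre)

# Virtual control: the forecasted test-group traffic had nothing changed.
virtual = [intercept + slope * c for c in control_post]

# Cumulative lift attributed to the treatment.
lift = sum(t - v for t, v in zip(test_post, virtual))
print(round(lift, 1))
```

The gap between the actual post-period traffic and the forecast is the estimated treatment effect; because the forecast tracks the control group, seasonal swings that hit both groups cancel out.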

Users can also find a detailed technical explanation on how the Causal Impact methodology compares with more traditional ones like difference-in-differences here.

To summarize our findings: paid search ads are very useful when you want insights and ideas about which organic search snippet experiments to perform. If you want to test click-through rate increases, you need to start with meta description experiments, as those won’t cause re-rankings.

Strong benefit-driven marketing copy in search snippets generally performs better than plain descriptive copy because it speaks to the needs of potential customers.

This article was originally published on MarTech Advisor
