A/B Testing Site Search

Search is the ultimate UI safety net. Any time a user can’t find what they’re looking for, they can at least turn to site search for one last attempt. Almost every site has some form of site search, yet it is one of the most frequently neglected sources of customer insight.

How do you know search is working for your customers? To answer this question, let’s first look at the three key components that make up a search interaction.

1. Presentation

Before anything is typed, users must first locate the search box. This sounds simple, but consider the following site search presentation:

Clicking on the floating “Search…” text in the top right is the only way to focus on the search box with invisible borders. Clicking on the search icon will immediately submit an empty search.

Beyond the appearance of the search box, presentation also affects how each search result is displayed. What information should we present to help the user narrow the results? How do you weigh the tradeoff between high-res product images and the number of items users can see on-screen at once? Are snippets helpful? How about user ratings?

A/B Testing Presentation
Determine the variations you’d like to test, and show each visitor a randomly selected variation. Make sure the selection is persistent for that user so they don’t fluctuate between different experiences (you can track this via a browser cookie).
One place to start is by testing the default display format for your search results. Is it better to show results in rows or tiles? Many search interfaces give you the option to choose the display format, but only a small percentage of users will ever bother to change the default. This is your chance to find out which presentation works best for your customers.
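
The persistent assignment described above can be sketched roughly as follows. This is a minimal illustration, not a production implementation: the cookie name `search_ab_variation`, the `get_variation` helper, and the use of a plain dict to stand in for the browser's cookies are all assumptions made for the example.

```python
import random

# Hypothetical cookie name, chosen for this example only.
COOKIE_NAME = "search_ab_variation"


def get_variation(cookies, variations, rng=random):
    """Return the visitor's variation, picking one at random on first visit.

    `cookies` is a plain dict standing in for the visitor's browser cookies.
    If it already holds a valid variation, that value is reused so the
    visitor keeps seeing the same experience on every request.
    """
    current = cookies.get(COOKIE_NAME)
    if current in variations:
        return current
    # First visit (or a stale value from an old test): assign and remember.
    chosen = rng.choice(variations)
    cookies[COOKIE_NAME] = chosen
    return chosen


# Usage: the same visitor gets the same layout on repeat visits.
visitor_cookies = {}
first = get_variation(visitor_cookies, ["rows", "tiles"])
second = get_variation(visitor_cookies, ["rows", "tiles"])
```

In a real web framework you would read the value from the request cookies and write it back on the response; the dict here only keeps the sketch self-contained.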

2. Retrieval

After the user sends off their search, the backend magic happens. What results should we return and in what order? Are the right fields being indexed? How do we determine user intent? This is a deep and complex area, and with so many moving parts it’s nearly impossible to predict how changes in the search algorithm will impact the quality of results and downstream conversion goals.

A/B Testing Retrieval
Want to try out a new search algorithm and make sure it’s moving conversion in the right direction? A/B test it! Split users into test cells and measure performance metrics such as latency, click-through rate (CTR), dwell time on result pages, add-to-cart rate, or units sold.
Even the perfect list of results won’t be sufficient if it’s accompanied by 10 seconds of waiting. This is why tests are so important; no amount of theory is going to replace hands-on data from real users.
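
As a rough sketch of what measuring these metrics per test cell might look like, the snippet below summarizes a list of logged searches. The record fields (`cell`, `latency_ms`, `clicked`, `added_to_cart`) and the `summarize_cells` helper are hypothetical; your own logging will likely use different names and carry more detail.

```python
from collections import defaultdict
from statistics import median


def summarize_cells(searches):
    """Group logged searches by test cell and compute simple metrics.

    Each search is a dict with hypothetical keys:
    cell, latency_ms, clicked (bool), added_to_cart (bool).
    """
    by_cell = defaultdict(list)
    for search in searches:
        by_cell[search["cell"]].append(search)

    summary = {}
    for cell, rows in by_cell.items():
        n = len(rows)
        summary[cell] = {
            "searches": n,
            # Share of searches followed by a click on a result.
            "ctr": sum(r["clicked"] for r in rows) / n,
            # Share of searches followed by an add to cart.
            "add_to_cart_rate": sum(r["added_to_cart"] for r in rows) / n,
            # Median is less sensitive to a few very slow requests than the mean.
            "median_latency_ms": median(r["latency_ms"] for r in rows),
        }
    return summary
```

Looking at latency next to CTR and add-to-cart rate in the same summary makes it easier to spot a variation that returns better results but makes users wait too long for them.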

3. Context

Beyond the explicit user query and filter options, there is bound to be implicit context around the user’s visit. Have they searched for similar things in the past? Should the search results be altered based on their purchase habits?

A/B Testing Context
With any change, it’s important to have a firm grasp of the current baseline performance. Always test experiences in parallel if possible, as comparing performance from different time ranges introduces variables that can make it hard to isolate the true impact of your changes (such as a traffic surge from a marketing campaign).
Personalization can have privacy implications, and users are sometimes given an option to opt out. If you are measuring the performance of personalized vs vanilla search, make sure users who opt out of their test cell are attributed properly to avoid contaminating your data.
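
One possible way to handle opt-outs, shown here only as an illustration, is to report them under their own label instead of mixing them into the personalized cell. The cell names, the `opted_out` flag, and both helper functions are assumptions for this sketch; the right attribution policy depends on how your test is designed.

```python
from collections import Counter


def reporting_cell(assigned_cell, opted_out):
    """Return the label a visitor should be reported under.

    Visitors assigned to the personalized cell who opted out never saw
    personalized results, so they get a separate label in this sketch.
    """
    if assigned_cell == "personalized" and opted_out:
        return "personalized_opted_out"
    return assigned_cell


def count_by_reporting_cell(visitors):
    """Count visitors per reporting label.

    Each visitor is a dict with hypothetical keys: cell, opted_out (bool).
    """
    return Counter(reporting_cell(v["cell"], v["opted_out"]) for v in visitors)
```

Keeping the opt-out group visible also tells you how large it is, which is useful when deciding whether the remaining personalized traffic is still representative.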

Example: Side-by-side search performance comparison by SeaUrchin.IO

You may have noticed a pattern here as old as the scientific method: Form a hypothesis and test your assumptions. But hold on! Search is a different beast from the other A/B tests you can run with services like Optimizely. When it comes to search, not all queries are created equal. It’s almost certainly the case that your existing system is tuned well for a subset of queries, and performs worse for others.

To truly understand search performance, you need an analytics solution that lets you break down user activity at the query level, for multiple test cells. This is the only way you can make sure improvements you made for a subset of underperforming queries did not come at the expense of your existing top performers.
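
To make the idea of a query-level breakdown concrete, here is a small sketch that computes CTR per query and per test cell, then lists queries where the test cell did noticeably worse than control. The record fields, the cell names `control` and `test`, the `min_drop` threshold, and the lowercase/trim normalization are all assumptions for illustration; this is not how any particular analytics product works.

```python
from collections import defaultdict


def ctr_by_query(searches):
    """Return {query: {cell: ctr}} from logged searches.

    Each search is a dict with hypothetical keys: query, cell, clicked (bool).
    Queries are lowercased and trimmed so trivial variants are grouped.
    """
    # query -> cell -> [clicks, searches]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for s in searches:
        query = s["query"].strip().lower()
        tally = counts[query][s["cell"]]
        tally[0] += int(s["clicked"])
        tally[1] += 1
    return {
        query: {cell: clicks / n for cell, (clicks, n) in cells.items()}
        for query, cells in counts.items()
    }


def regressed_queries(ctrs, control="control", test="test", min_drop=0.05):
    """List queries whose CTR in `test` is at least `min_drop` below `control`."""
    return sorted(
        query
        for query, cells in ctrs.items()
        if control in cells
        and test in cells
        and cells[control] - cells[test] >= min_drop
    )
```

A real analysis would also account for how many searches each query received, since a CTR computed from a handful of searches is too noisy to act on.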

Example: Top performing & underperforming queries tracked by SeaUrchin.IO

SeaUrchin.IO’s search insights platform gives you the power to test your ideas without manual logging or backend processing requirements. It plugs into any existing search implementation on the web, and gives you real-time insights out of the box in less than 5 minutes. Tracking new search A/B tests is as simple as clicking a button, with no code push required.

Want to know how users are using your site search? Find out more at https://seaurchin.io/