Last month, I got to attend Opticon 2015, the largest conference on optimizing and testing, hosted by Optimizely. Pier 27, nestled between the San Francisco Bay and Coit Tower, was a gorgeous setting for two days of networking, hammock naps, nitro margaritas — oh, and gathering knowledge to take back to work. Dozens of break-out sessions, a keynote from Optimizely’s CEO Dan Siroker, and a fireside chat with Marc Andreessen combined for an inspirational event with a LOT of takeaways to chew on. We sorted through them and distilled them down to seven major lessons.
A/B testing can be a big scary thing if you’re just getting started with optimizing your website. Start with these basics to lay the groundwork, and keep them in mind when things get complicated.
1. Before you start, nail down your most important goals.
Decide what success means before you start your experiment. Sit down with your site’s stakeholders and brainstorm what you’re trying to accomplish. Chances are your list will be long, so pare it down to one to three primary goals before you conceptualize any experiments. In addition, every goal you set for an experiment needs to be clear and actionable. ‘Make the page better’ or ‘help people find what they’re searching for’ are not clear, actionable goals. A good hypothesis has a variation (the B to your A) and clearly defined metrics.
2. Think Big.
Time and time again at Opticon, we heard that the biggest problem with A/B testing is complexity. There are so many things on any site to test, and it can be tempting to dream up a million tests you could run (blue v. orange header! page title variations!). Stop right there! Take a step back, and think about the big picture for your site. One of the best use cases for A/B testing is a site redesign — test the alternative design against the original and, based on the results, home in on details as you progress. Don’t get stuck on the small stuff right off the bat.
3. Think Simple.
Ask yourself these questions before running your experiment: What am I trying to show with my idea? What key metrics should it drive? Are all of my goals and variations necessary to achieve that? There’s a statistical reason to keep it simple: error rates grow with the number of variations running at a time, and the more tests you run, the longer it can take to determine a winner or loser.
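To see why, here’s a minimal sketch (with illustrative numbers, not anyone’s real data) of how the chance of at least one false positive climbs as you add independent variations, each tested at a 5% significance level:

```python
# Family-wise error rate: the chance of at least one false positive
# across several independent tests, each run at significance level alpha.
def family_wise_error(num_variations, alpha=0.05):
    return 1 - (1 - alpha) ** num_variations

for k in (1, 3, 10, 20):
    print(f"{k:2d} variations -> {family_wise_error(k):.1%} chance of a false positive")
```

With one variation you live with the 5% you signed up for; with ten independent variations, the odds of at least one spurious “winner” are roughly 40%.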
4. You’re probably wrong.
Repeat this to yourself: What I think doesn’t matter, and what you think doesn’t matter. A challenge faced across organizations large and small is establishing a testing culture. A testing culture means that people buy into the idea that tests can and should be used to make decisions backed up by data, not opinions — putting ego aside. Don’t let the HIPPO (Highest Paid Person’s Opinion) determine what wins or loses.
5. You will fail, a lot.
Just as you’re getting excited about all the cool stuff you can test, we have bad news: most tests end in a statistical tie (whomp whomp). How do you move forward after your big idea ends in a null result? Look at segments to spur new hypotheses. Overall it might be a tie, but a segment like mobile users or existing customers might show a clear winner or loser, and you can dig into that a little further to develop new hypotheses.
6. Use qualitative feedback to find hypotheses.
Optimization best practice says to use both qualitative and quantitative tools. A qualitative tool like exit surveys or user videos can help you step outside the corporate mindset and hear what your real users think — it can alert you to areas for improvement on your site that you’ve been glossing over. If the same problem surfaces a few times, create a hypothesis to test a solution and deploy it.
7. Use a reliable stats platform: it’s more complicated than you think.
In his keynote, Dan Siroker introduced Optimizely’s new stats engine, and it is seriously cool. But diving into the details of the statistics used in A/B testing also surfaced the pitfalls of traditional methods — watch out for these, and make sure you understand how your testing platform works. Here are three things you need to know before using a traditional stats test (the t-test):
- The sample size needs to be set before the experiment starts, even though you may be missing information at that stage. If it’s too low, you may miss a real effect; if it’s too high, the experiment takes longer than it needs to. To correct this, Optimizely’s new stats engine calculates the number of “visitors remaining” that it would take to validate the currently observed lift, so you know how long to keep running an experiment before calling it.
- Don’t peek. Every time you check on how an experiment is performing, you make a decision on whether or not to continue running it, and this increases the likelihood of seeing a false positive.
- You might be mistaking False Positive Rate for False Discovery Rate. To calculate your False Positive Rate, you need to know the total number of A/B test combinations (variations multiplied by goals multiplied by pages) — something that’s difficult to account for in testing frameworks (and something that the new stats engine takes care of).