A/B tests and copy — what, why, how

Rachael Bundock
Booking.com — UX Writing
8 min read · Aug 6, 2018

What happens when cold hard data and soft fluffy words collide

A/B testing. It’s one of the things Booking.com has built its reputation on, and it’s a huge part of the day-to-day job of anyone who works in the Tech department.

Before I started working at Booking.com I’d been involved in a few A/B tests, but nothing prepared me for the scale of Booking.com’s test culture, or for how central testing copy is to it.

Looking back (through piles of data-based strategies and heaps of statistical analysis) it’s funny I was so surprised. After all, words can majorly influence how you perceive things. Whether you enjoy an article, fall in love with a book or nurse a burning hatred for a website often rests entirely on how the words sold it to you.

So, Booking.com does a lot of testing, built on a beast of an A/B tool that makes running tests quick, painless and (reasonably) simple for people without any background in testing itself.

But when it’s this easy, how do Booking.com’s UX copywriters decide why, when and what to test?

Why A/B test copy at all?

A copywriter knows what copy is good and what copy is bad and gets it perfect every time.

Right?

Well, apart from that not being quite true, A/B testing is the democratisation of decision-making. It means the death of the Highest Paid Person’s Opinion, and the chance to let our users decide what matters to them. You’re asking the opinion of your users, without the need for an immersion-breaking survey.

But it’s not just throwing word-spaghetti at a wall and seeing what sticks. Copy tests need a solid hypothesis to work from, too.

Remember your school science lessons? Put it like this:

By doing {x}, I will prove {y}.

It seems odd and reductive to boil words down to hard science. But {x} and {y} can be anything.

‘By front-loading the important information in this header, I will prove that it’s easier for users to find their booking information.’

‘By adding people’s names to email subject lines, I will prove that it encourages them to open their emails.’

We have to build a hypothesis for what we’re changing, why, and the result we expect to see. If you’ve got a solid hypothesis, your test explains itself, and it becomes easier to make a decision at the end of your test — did you prove it, yes or no.

(And I say ‘easier’ not ‘easy’ for a reason — we’ll discuss why later).
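If it helps to keep yourself honest, you can jot the hypothesis down as a tiny structure before going anywhere near a testing tool. The sketch below is purely illustrative; the field names and the example test are made up, not part of any real tool.

```python
from dataclasses import dataclass

# A purely illustrative way to pin down a copy-test hypothesis before running it.
# Field names and values are hypothetical, not any particular tool's schema.
@dataclass
class CopyTestHypothesis:
    change: str              # the {x}: what you're changing
    expected_outcome: str    # the {y}: what you expect to prove
    primary_metric: str      # the number that decides yes or no
    expected_direction: str  # "up" or "down"

subject_line_test = CopyTestHypothesis(
    change="Add the recipient's first name to the abandoned-cart email subject line",
    expected_outcome="More recipients open the email",
    primary_metric="email_open_rate",
    expected_direction="up",
)
```

If you can’t fill in every field, you don’t have a hypothesis yet; you have a hunch.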

So, you’ve made the decision to test your website’s copy. But what are you trying to achieve by doing it?

Challenge your assumptions

That thing you’re completely certain of. How certain are you that it’s true?

Things you know about your user, things you know about your website, things you know about copywriting.

A/B testing is a way to challenge your ideas and prove (or even disprove) your assumptions, and ensure you’re addressing the user in the right way at the right moment. You can use this to:

Build a strategy

To know where you’re going, you need to know where to start.

A/B testing is the bedrock of copywriting at Booking.com, whether we’re using past tests to create data-based strategies, or using testing to push strategies further forward. Using A/B testing alongside a strategy can be your guiding force and sanity check — if you go in throwing ideas everywhere without a solid base under your feet it’ll be hard to find a through-line of results that couldn’t be attributed to statistical anomalies or coincidences.

Test your tone of voice

So, that shiny new company tone of voice. What are you going to do with it? Do you know how it’s going to be implemented? The beauty of the A/B test is that it’s a commitment-free way of asking the opinion of your users.

Stress-testing your tone of voice is essential. Your test didn’t go exactly to plan? Make some changes, adjust how the ToV is used, go back in. Didn’t work at all? Maybe the ToV needs to be revisited using the new data, revising your assumptions about what you ‘know’ about your users.

Create a user profile

Who’s looking at your copy? What’s your audience like? Put your copy to the test to build up your understanding of your users and gain solid, usable insights into exactly what they like.

And from these tests, you can expand outwards. Say you proved that your audience really likes playful, casual copy; keep testing that theory elsewhere with copy, but also consider if the design matches your tone. Maybe a future test could involve a designer to ensure design and words are working together.

Where do you go from here?

Getting fast insights into your users can contribute to the work of everyone around you — whatever area you work in. And from this you can end up as the person in your area with the best insights into customer behaviour, likes, dislikes, what speaks to them and what doesn’t.

Finding the B to your A

So you’ve decided on why you want to A/B test. What happens now?

At Booking.com, we test everything and anything. No copy is so good it can’t be touched, and you’d be surprised at how very minor copy changes can make your site that much better for your users.

Page headers, titles and email subject lines are popular targets for copy tests, and can give you the most bang for your buck, traffic-wise. But before you start planning your tests, bear in mind that:

It’s no good for large-scale changes. A/B tests rely on small, incremental changes. If you change copy in multiple places at once, you’ll never be able to identify exactly what copy is causing the effect you’re seeing in the data.

The data is dumb. So people need to be smart. A/B testing is just a way of collecting data, it can’t tell you exactly why something happened. Maybe you see that changing a subject line in your abandoned cart email increased open rates. This is where you use your human brain to turn the data into actual insights. Was it the change in tone? Frontloading the user name? Maybe that you made it shorter? Or maybe all three? From this data, you can build a theory, then go back to test that theory.

Go for impact. If you’re planning on running a test on your site, especially when you’re just starting out, go for broke with traffic. Getting enough data from visitors to conclusively prove your results can be a real challenge, and it only gets harder if you run tests on pages that rarely see the light of digital day. Bigger numbers mean clearer results (there’s a rough sample-size sketch after this list).

It never ends. Keep testing, even when it hurts. The instinct might be, whatever the result, to declare the test over and never go back to try again. Remember, there’s always more to learn.
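To see why low-traffic pages hurt, here’s the rough sample-size sketch mentioned above, for a two-variant test on a conversion-style metric. It uses the standard two-proportion approximation; the baseline rate and the lift you want to detect are made-up numbers you’d swap for your own.

```python
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, minimum_lift, alpha=0.05, power=0.8):
    """Rough visitors needed per variant to detect an absolute lift in a rate.

    Standard two-proportion approximation; treat it as a ballpark figure,
    not a substitute for whatever your testing tool calculates.
    """
    p1, p2 = baseline_rate, baseline_rate + minimum_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    pooled = (p1 + p2) / 2
    numerator = (z_alpha * (2 * pooled * (1 - pooled)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return int(numerator / (p2 - p1) ** 2) + 1

# Made-up example: a 4% email open rate, and you want to detect a 1-point lift.
# That works out to roughly 6,750 recipients per variant.
print(sample_size_per_variant(0.04, 0.01))
```

Shrink the lift you want to detect and that number balloons (halving it roughly quadruples the sample), which is exactly why pages that rarely see the light of digital day make poor first candidates.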

What’s your evidence?

This theory that you want to test. What made you think it?

Previous A/B tests, your company’s user research, maybe an interesting piece of data in an article you read online.

Making sure that your idea has some kind of basis in reality will reassure you (and everyone around you) that you’re actually working in the right direction.

What’s your proof of success? What would be your proof of failure?

Choosing what the proof (and ‘anti-proof’, so to speak) of your success will be is what guides your decisions, and can ensure you write copy to meet your projected outcome.

Start out thinking about the user problem, rather than coming up with copy to suit a problem that may or may not exist.

Anecdotal evidence is still evidence (e.g. my Mum hates this email subject line) and a good place to start, but back it up with statistical evidence (e.g. my Mum hates this email subject line and we see a lot of unsubscribe messages from this email).

Evidence is the foundation your hypothesis sits on, and your hypothesis is what you should always end up comparing your results to.

Measuring success

Booking.com has a gigantic, home-grown solution to all our A/B needs, but that doesn’t mean you need to set out right now and build your own testing beast. Whatever A/B testing tool you use, every test starts the same way: selecting the metrics that will prove whether your hypothesis was correct or not.

A popular metric is, naturally, conversion. And there’s nothing wrong with running tests just for that purpose, but at Booking.com we always look at more granular metrics to check minute user interactions. A copy test on a CTA button that increases conversion is no good if it also confuses the user into buying the wrong thing, right?

Seeing the many specific impacts of your change on the user can help you visualise exactly what your users do on your site.
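As a toy illustration of pairing conversion with a more granular check, you could summarise each variant along a primary metric and a guardrail metric. The counts and metric names below are invented.

```python
# Invented per-variant counts: a primary metric (bookings) plus a guardrail
# (cancellations), so a 'win' on conversion can't hide confused users.
results = {
    "A": {"visitors": 20_000, "bookings": 640, "cancellations": 58},
    "B": {"visitors": 20_000, "bookings": 690, "cancellations": 97},
}

for variant, counts in results.items():
    conversion = counts["bookings"] / counts["visitors"]
    cancel_rate = counts["cancellations"] / counts["bookings"]
    print(f"{variant}: conversion {conversion:.2%}, cancellation rate {cancel_rate:.2%}")
```

In this made-up case, B converts better but far more of its bookings get cancelled, which is exactly the kind of side effect a single conversion number would hide.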

Interpreting the results

You run your A/B test. What then?

You step back, and look long and hard at the results.

You need to completely understand the data before you make your final decision. Did you see the impact on your metrics you expected? Is there something you didn’t expect?

An important thing to bear in mind is that you need to avoid chasing statistical ghosts — whether that’s a metric that changes by coincidence (a common side effect of low traffic) or the temptation to reverse-engineer your hypothesis to fit the new facts.
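One simple guard against those ghosts is to ask whether the difference you’re looking at is even distinguishable from noise. Below is a minimal two-proportion z-test sketch, assuming a conversion-style metric and made-up counts; your testing tool will do this properly, this is just the back-of-the-envelope version.

```python
from statistics import NormalDist

def two_proportion_p_value(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-sided p-value for the difference between two conversion rates.

    A quick sanity check only; it won't replace your testing tool's statistics.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    standard_error = (pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b)) ** 0.5
    z = (rate_b - rate_a) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Made-up counts: a lift from 4% to 5% looks lovely, but with this little
# traffic the p-value comes out around 0.24, i.e. easily explained by chance.
print(two_proportion_p_value(48, 1_200, 60, 1_200))
```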

If you do decide to introduce your change to all traffic, then congratulations! This isn’t your cue to dust off your hands, call it a day and never think about this test again, of course. Examine what you did with this copy change. Maybe you tried a different tone of voice. Where else could you introduce it? If you tried changing a headline to be more direct, would other areas benefit? From here, you can start building your understanding and continue your testing journey.

Failure is most definitely an option

As we saw, there will be occasions where you don’t want to introduce your change fully because you’ve spotted a bad result in your data. From there, it’s almost the same outcome as a straight success — examining what you did and what changed, building a theory about why it happened, and then continuing to test to support this new idea.

Your learnings (good and bad) are just as important as your success. So the best thing you can take away from A/B testing? Never be afraid to fail.

So maybe your test was terrible for your main metric. Or maybe it was perfect but caused an unexpected bad side effect somewhere else. Those aren’t failures. You only failed the test if you didn’t learn anything.

As long as you can take learnings and use them in future tests, strategies or profiles, every test you run will be successful.

We’re always on the hunt for new writing talent. Wanna join us? Apply here.
