How A/B Testing Helps Microsoft and Why You Should Consider It Too

Use A/B testing wisely to maximise results

Rhys Kilian
Feb 15, 2022

A/B testing is an experimental approach to decision-making that could generate significant value for your organisation. Microsoft has been using it since 2006 and now their Experimentation Platform (ExP) is embedded across all their major products.

A/B testing is a controlled experiment in which two versions of a variable (e.g. a button, an algorithm, marketing content) are shown to users who have been randomly split into two groups. Once a sufficient number of observations have been recorded in each group, the groups are compared to determine whether there is a statistically significant difference. This form of experimentation was popularised by online companies, such as Google’s now-famous experiment testing 41 shades of blue to see which one performed best.
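To make those mechanics concrete, here is a minimal sketch in Python. The click counts are invented for illustration, and the two-proportion z-test from statsmodels is just one reasonable choice of significance test:

```python
# Hypothetical A/B comparison: did the new variant lift the click-through rate?
from statsmodels.stats.proportion import proportions_ztest

clicks = [530, 584]        # illustrative clicks in control (A) and treatment (B)
users = [10_000, 10_000]   # users randomly assigned to each group

z_stat, p_value = proportions_ztest(count=clicks, nobs=users)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")

if p_value < 0.05:
    print("Statistically significant difference at the 5% level")
else:
    print("No statistically significant difference detected")
```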

After reading this article you will understand the benefits of A/B testing, how Microsoft uses it, and some potential pitfalls.

The benefits of A/B testing

A/B testing is a form of causal inference. Causal inference is the study of how treatments affect outcomes of interest. For example, does a change to my recommendation algorithm increase the click-through rate of my product?

Using data to drive decision-making is a big step for many organisations, with as many as two-thirds of CEOs still relying on gut instincts when making decisions. Even if this isn’t the case at your organisation, you could see large benefits from moving beyond correlational analysis.

A study conducted with Microsoft, The Benefits of Controlled Experimentation at Scale, outlined the following advantages of experimentation:

  1. Value discovery: a business can learn what its customers value by experimenting with its product and marketing platform
  2. Lowering product complexity: only one-third of new product features tested on Microsoft’s experimentation system showed noticeable improvement in key metrics, allowing them to avoid wasting resources implementing non-value-adding features
  3. Team activity planning: learnings from experiments can be used to prioritise the strategic direction of personnel
  4. Ensuring quality and capacity: implementing a new feature to a subset of the total user base limits the impact of problems caused by bugs and underestimated resource allocation

With these benefits, it is unsurprising to find that many organisations, not just Microsoft, have embraced A/B testing as part of their core culture.

What organisations are using A/B testing?

Experimentation has become ubiquitous in the technology industry. Software and the internet have facilitated an environment where A/B testing is relatively inexpensive and can be run with millions of users. Everything is now tested, from front-end user-interface changes to back-end algorithms and marketing content and media.

Over one hundred thousand experiments are run annually by the thirteen organisations present at the Practical Online Controlled Experiments Summit, Microsoft among them.

To help us understand what sort of experiments these companies are running, let’s look at two examples from Microsoft.

A case study of experimentation at Microsoft

Microsoft has a culture of experimentation. A/B testing is used across its suite of products including Office, Bing and Skype. Their involvement in the study of the benefits of experimentation at scale gives us an idea of the tests that Microsoft is running at any given time.

Case study 1: MS Word contextual commands

Designers at Microsoft hypothesised that adding contextual commands (i.e. the suggested actions shown at the bottom of the screen) to the mobile app would result in more users editing documents on their phones and increase 2-week retention.

The initial A/B test produced an unusual result. Further investigation found a telemetry bug introduced with the release. Thankfully, the bug only affected a small group of users involved in the experiment, and it was fixed (and the fix validated using A/A testing).
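The idea behind an A/A test deserves a brief aside: both groups receive the identical experience, so a healthy experimentation pipeline should report a significant difference only about as often as your significance level. A rough sketch of that check, with simulated data standing in for real telemetry:

```python
# A/A sanity check: identical groups should "win" only about alpha of the time.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(42)
alpha, users_per_group, true_rate = 0.05, 10_000, 0.06   # illustrative values

false_positives = 0
n_runs = 1_000
for _ in range(n_runs):
    # Both halves are drawn from the same distribution: no real difference exists.
    conversions = rng.binomial(users_per_group, true_rate, size=2)
    _, p_value = proportions_ztest(conversions, [users_per_group] * 2)
    false_positives += p_value < alpha

print(f"Significant A/A results: {false_positives / n_runs:.1%} (expect ~{alpha:.0%})")
```

A rate well above the chosen significance level would point to a problem with randomisation or telemetry rather than a real effect.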

Subsequent A/B testing over two weeks confirmed there was a significant increase in mobile edits but no significant increase in 2-week retention. Nevertheless, the experiment was deemed a success and the feature was prioritised for other MS Office products.

Case study 2: OneNote sharing

The OneNote team hypothesised that allowing users to share a single page of a OneNote notebook, rather than the whole notebook only, would increase the percentage of users sharing.

The A/B test showed that users with this change had a significantly higher share rate. However, further investigation revealed that a subset of the treatment group didn’t show this improvement. As in the previous case study, the feature had triggered a bug that affected users on an older OS version. The bug was identified and fixed because of the controlled experiment.

The potential pitfalls of A/B testing

A/B testing can help your organisation generate value, as Microsoft has shown. However, running experiments isn’t as simple as rolling out a change and measuring the difference in key metrics. A/B tests that are poorly designed or incorrectly analysed could end up costing your organisation.

In general, you shouldn’t run an A/B test when:

  • You don’t have a large enough sample size (see the power-calculation sketch after this list)
  • You can’t invest the time
  • You don’t have an informed hypothesis with key metrics to measure
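On the first point, a rough power calculation tells you how many users per group you need before a difference of practical interest becomes detectable. A sketch using statsmodels, with the baseline rate and minimum detectable effect as assumptions you would replace with your own numbers:

```python
# Per-group sample size to detect a lift from 6.0% to 6.5% click-through rate
# at 5% significance and 80% power (all figures are illustrative assumptions).
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline_rate = 0.060   # assumed current rate
target_rate = 0.065     # smallest lift worth detecting (assumed)

effect_size = proportion_effectsize(target_rate, baseline_rate)
n_per_group = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.80, ratio=1.0
)
print(f"Roughly {n_per_group:,.0f} users needed per group")
```

If the required sample is larger than the traffic you can realistically expose, the experiment isn’t worth running as designed.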

Even if you meet the above criteria, it’s easy for your experiment to fail if you make any of eight common mistakes, such as cross-contamination between groups or failing to hold non-treatment variables constant.

Given the complexity of A/B testing, you may be tempted to simply make a change and assess success by measuring your key metrics before and after. If only it were that easy. Without a controlled experiment, it’s difficult to determine whether an improvement in your key metric was caused by your change or by some other external variable.
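A toy simulation shows the trap. If the metric is already drifting upward for unrelated reasons (seasonality, a marketing push), a before/after comparison attributes the drift to your change, while a concurrent randomised split cancels it out. All numbers below are invented:

```python
# Before/after vs. randomised split when the metric drifts upward on its own.
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
true_lift = 0.000                # the change actually does nothing
baseline, drift = 0.050, 0.005   # external factors raise the later rate

# Naive before/after: the "after" period benefits from drift, not the change.
before = rng.binomial(1, baseline, n).mean()
after = rng.binomial(1, baseline + drift + true_lift, n).mean()
print(f"Before/after 'lift': {after - before:+.3%}")       # looks like a win

# Controlled experiment: both groups run at the same time, so drift cancels.
control = rng.binomial(1, baseline + drift, n).mean()
treatment = rng.binomial(1, baseline + drift + true_lift, n).mean()
print(f"A/B lift estimate:   {treatment - control:+.3%}")  # approximately zero
```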

But it’s also possible to go too far with experimentation. Focussing on incremental changes may stifle innovation (think Google’s 41 shades of blue).

Conclusion

Investing in A/B testing, like Microsoft, can benefit your organisation, provided you avoid the common pitfalls. A/B testing will allow you to identify high-value opportunities, plan strategic objectives and ensure a high-quality offering for your customers. Microsoft has benefited from A/B testing by incorporating contextual commands into the MS Word app and adding page sharing to OneNote. However, improperly designed A/B tests or focussing on incremental improvements as opposed to innovation may end up hurting your organisation. Nonetheless, A/B testing done right can help you unlock value by making more data-driven decisions.

Liked what you read? Follow me on Medium.

Otherwise, tweet me or add me on LinkedIn.
