Increasing experimental power with variance reduction at the BBC

This article discusses how the Experimentation team have been accounting for pre-experiment variance in order to increase the statistical power of their experiments

Frank Hopkins
BBC Data Science

--

Introduction

In the Experimentation and Optimisation team at the BBC we have the privilege of working with very large sample sizes, which means we can confidently determine how our variants have performed in a controlled experiment. As the BBC continues to iterate and optimise, however, changes tend to produce smaller effect sizes, and determining whether those changes are statistically significant becomes more difficult. Detecting these smaller effects in turn requires greater volumes of traffic.
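
To make the link between effect size and traffic concrete, here is a minimal sketch of the textbook sample-size approximation for a two-sided, two-sample z-test. It assumes equal group sizes, a roughly normal metric with a known standard deviation, a 5% significance level and an 80% chance of detecting a true effect. The function name `required_sample_size` and all the numbers are illustrative assumptions, not BBC figures or the team's actual tooling.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(effect, std_dev, alpha=0.05, power=0.8):
    """Approximate users needed per group for a two-sided, two-sample z-test.

    Hypothetical helper: assumes equal group sizes and a known,
    common standard deviation in both groups.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # quantile for the desired detection rate
    return ceil(2 * ((z_alpha + z_power) * std_dev / effect) ** 2)


# Illustrative values only: the same metric spread with a shrinking effect size.
for effect in (0.5, 0.25, 0.125):
    n = required_sample_size(effect, std_dev=10.0)
    print(f"effect={effect}: about {n:,} users per group")
```

Under these assumptions the required sample grows with the square of the ratio `std_dev / effect`, so halving the detectable effect roughly quadruples the traffic needed, while halving the metric's standard deviation cuts it to roughly a quarter.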

Small differences are hard to detect at statistical significance because experiments that target them tend to have low statistical power. When an experiment is underpowered, statistical analysis will likely return non-significant differences between your experimental conditions, regardless of whether a true and meaningful effect actually exists. Low statistical power also arises when the treatment effect is too small relative to the variance of the metric being assessed (for more see here). Given that within-group variance is so high on our streaming platforms, where we may have super-users who view hundreds of pieces of content a month and others who just come for the odd…

--

Frank Hopkins
BBC Data Science

Experimentation Data Scientist, specialising in digital experimentation. Posts ranging from data science to website optimisation and digi-analytics.