Applying the Delta Method in A/B Test Analysis

Ahmad Nur Aziz
Aug 11, 2021


Hi! This is my first article on Medium. Hopefully, it is easy to understand and useful. Writing is something I’ve wanted to do for a long time, but choosing an interesting topic was not easy for me. In the end, I decided to write about A/B testing, a gold-standard method that is commonly used in many tech companies. Feel free to give me feedback so I can improve, thank you! 😊

WHY the Delta Method?

In A/B test analysis, we need a hypothesis test to decide whether the treatment variant differs from the control variant in a statistically significant way. If our metric is continuous, this can be done with an independent two-sample t-test, since we know that the variants are independent of each other.

Independent t-test formula
Variance formula
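
For reference, one common way to write these two formulas is sketched below. The notation is an assumption: the subscripts T and C stand for treatment and control, x̄ is a sample mean, s² a sample variance, and n a sample size. The unpooled (Welch) form of the t-statistic is assumed here.

```latex
t = \frac{\bar{x}_T - \bar{x}_C}{\sqrt{\dfrac{s_T^2}{n_T} + \dfrac{s_C^2}{n_C}}},
\qquad
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2
```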

However, these mean and variance formulas only apply to i.i.d. (independent and identically distributed) random variables, and in real business cases our metrics are more complex. Many business metrics are defined as ratios, for example click-through rate (CTR), which is defined as Clicks/Views.

Clicks and views are both random variables, so when we combine them into the single metric CTR, its distribution depends on their joint distribution. Also, if randomization is based on user_id, one user can generate multiple views, so the views are not independent of each other. Therefore the variance formula above can’t be used to estimate the variance of CTR. Instead, we use the delta method to approximate the variance of a ratio metric.

WHAT is the Delta Method?

Basically, the delta method extends the normal approximation of the central limit theorem. It approximates the distribution of a function of asymptotically normal random variables by applying a Taylor expansion to that function. Suppose we have y = g(x); we can expand the function around the mean of x as follows:

Taylor approximation of y=g(x)
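
Written out, a first-order version of this expansion looks like the sketch below, assuming g is differentiable and using μ for the mean of X (the symbol is an assumption about notation):

```latex
g(x) \approx g(\mu) + g'(\mu)\,(x - \mu)
```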

We can approximate the variance of the function as:

Variance approximation of y=g(x)
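
As a sketch of where this comes from: take the variance of a first-order expansion g(X) ≈ g(μ) + g'(μ)(X − μ), with μ = E[X] (notation assumed). The term g(μ) is a constant and drops out, leaving only the scaled variance of X:

```latex
\mathrm{Var}\big(g(X)\big) \approx \big[g'(\mu)\big]^2 \, \mathrm{Var}(X)
```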

Suppose that we have a ratio metric Z = Y/X. We can follow the same procedure with the Taylor series, keeping in mind that the function now depends on two variables, X and Y. Without going into further technical detail, we can approximate the variance of Z as follows:

Variance approximation of ratio Z
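
A commonly used form of this approximation is sketched below. It assumes Z is the ratio of per-unit sample means, Ȳ/X̄, computed over n independent randomization units (for example, users). The symbols μ_X and μ_Y (means), σ_X² and σ_Y² (variances), and σ_XY (covariance of X and Y) are notation introduced here for illustration.

```latex
\mathrm{Var}\!\left(\frac{\bar{Y}}{\bar{X}}\right) \approx
\frac{1}{n}\left[
\frac{\sigma_Y^2}{\mu_X^2}
- \frac{2\,\mu_Y\,\sigma_{XY}}{\mu_X^3}
+ \frac{\mu_Y^2\,\sigma_X^2}{\mu_X^4}
\right]
```

The covariance term is what a naive per-view variance formula misses: when users with more views also produce more clicks, σ_XY is positive and it changes the estimated variance.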

Example

Suppose we have two variants, treatment and control, and our metric is CTR = Clicks/Views. First, we calculate the mean of the metric for each variant:

Mean CTR of treatment and control variant
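
One way to write this, assuming the mean CTR of a variant is its total clicks divided by its total views, with one (clicks, views) pair per user i and n_k users in variant k (all symbols here are assumed notation):

```latex
\widehat{\mathrm{CTR}}_k
= \frac{\sum_{i=1}^{n_k} \mathrm{clicks}_i}{\sum_{i=1}^{n_k} \mathrm{views}_i}
= \frac{\bar{Y}_k}{\bar{X}_k},
\qquad k \in \{T, C\}
```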

Then we calculate the variance of the CTR for each variant using the variance formula for a ratio, with Y as Clicks and X as Views.

Variance of CTR
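
As a rough illustration, this variance can be computed from per-user totals with a few lines of Python. This is a minimal sketch, not a definitive implementation: it assumes one (clicks, views) pair per user, uses NumPy, and the function name `ratio_variance_delta` is hypothetical.

```python
import numpy as np


def ratio_variance_delta(y, x):
    """Delta-method variance of mean(y) / mean(x).

    y, x: per-user totals (e.g. clicks and views), one entry per user.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(x)  # number of users (randomization units)

    mean_y, mean_x = y.mean(), x.mean()
    var_y = y.var(ddof=1)                 # sample variance of clicks
    var_x = x.var(ddof=1)                 # sample variance of views
    cov_xy = np.cov(x, y, ddof=1)[0, 1]   # sample covariance of views and clicks

    return (var_y / mean_x**2
            - 2 * mean_y * cov_xy / mean_x**3
            + mean_y**2 * var_x / mean_x**4) / n


# Made-up per-user data, for illustration only
clicks = [0, 1, 2, 1]
views = [2, 2, 2, 2]
print(ratio_variance_delta(clicks, views))
```

A quick sanity check on the formula: if every user has exactly the same CTR (clicks proportional to views), the three terms cancel and the estimated variance is zero.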

Finally, we calculate the t-value. If its absolute value is greater than the critical value, we reject the null hypothesis; otherwise, we fail to reject it.

t-value of CTR difference between control and treatment
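
In symbols, a sketch of this statistic could look as follows, assuming the two variants are independent and that each variance estimate is the delta-method variance of that variant's CTR (so it already includes the division by the number of users). The hat notation is an assumption.

```latex
t = \frac{\widehat{\mathrm{CTR}}_T - \widehat{\mathrm{CTR}}_C}
{\sqrt{\widehat{\mathrm{Var}}\big(\widehat{\mathrm{CTR}}_T\big)
     + \widehat{\mathrm{Var}}\big(\widehat{\mathrm{CTR}}_C\big)}}
```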

Lastly, I have included an example of the calculation using dummy data. The calculation is simple and straightforward. If you usually use SQL to collect and process data from a database, you can even calculate the confidence interval directly in SQL.
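
As a rough illustration of such a calculation, here is a minimal Python sketch on made-up per-user data. Everything in it is an assumption for illustration: the numbers, the helper name `ctr_and_delta_variance`, and the use of a normal critical value at the 5% significance level (with very few users, as in this toy data, the normal approximation is crude).

```python
from statistics import NormalDist

import numpy as np


def ctr_and_delta_variance(clicks, views):
    """Return (CTR, delta-method variance of CTR) from per-user totals."""
    clicks = np.asarray(clicks, dtype=float)
    views = np.asarray(views, dtype=float)
    n = len(views)  # number of users (randomization units)

    mean_y, mean_x = clicks.mean(), views.mean()
    var_y, var_x = clicks.var(ddof=1), views.var(ddof=1)
    cov_xy = np.cov(views, clicks, ddof=1)[0, 1]

    ctr = mean_y / mean_x  # same as total clicks / total views
    variance = (var_y / mean_x**2
                - 2 * mean_y * cov_xy / mean_x**3
                + mean_y**2 * var_x / mean_x**4) / n
    return ctr, variance


# Dummy data: one entry per user in each variant
control_views = [3, 5, 2, 8, 4, 6, 1, 7]
control_clicks = [1, 1, 0, 2, 1, 1, 0, 2]
treatment_views = [4, 6, 3, 7, 5, 2, 8, 5]
treatment_clicks = [2, 2, 1, 3, 2, 1, 3, 2]

ctr_c, var_c = ctr_and_delta_variance(control_clicks, control_views)
ctr_t, var_t = ctr_and_delta_variance(treatment_clicks, treatment_views)

# Difference in CTR and its standard error (variants are independent)
diff = ctr_t - ctr_c
se = float(np.sqrt(var_t + var_c))
t_value = diff / se

# Two-sided test at the 5% level, using a normal critical value
z_crit = NormalDist().inv_cdf(0.975)
ci_low, ci_high = diff - z_crit * se, diff + z_crit * se
reject_null = bool(abs(t_value) > z_crit)

print(f"CTR control   = {ctr_c:.4f}")
print(f"CTR treatment = {ctr_t:.4f}")
print(f"difference    = {diff:.4f}, 95% CI = ({ci_low:.4f}, {ci_high:.4f})")
print(f"t-value       = {t_value:.3f}, reject null: {reject_null}")
```

The same aggregates (per-user means, variances, and the covariance of clicks and views) are all that the formula needs, which is why the calculation also translates directly to a SQL query over a per-user table.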

Reference:

  1. Choice of randomization unit and analysis
  2. Applying delta method in metric analytics
  3. Controlled experiments on the web survey and guide
  4. Consistent Transformation of Ratio Metrics for Efficient Online Controlled Experiments
  5. Ratio estimator

Thank You!
