# Quick Takes: A|B Testing

Data Science is a complex discipline that combines programming, mathematics, statistics, and scientific methods, and many of its concepts can (or must) be approached on all these fronts.

Early on in my Data Science journey, bright-eyed and bushy-tailed, I consistently found myself trying to fit in as much content as I could, going through hours (sometimes days) of googling, staring at screens, reading related literature, cramming and coding as much as I could. Helluva ride! (s/o SG-DSI-5!)

That was then.

I learnt a lot in a short time, but also realised that, as with most things done fast, the concepts I thought I understood started to blur.

In the spirit of ‘use it or lose it’, this post (I hope) will be the first in a series covering possible questions we might get as Data Science practitioners, each with a possible outline of an answer. The aim is to prepare a concise way (5 minutes tops) to answer common questions; I also identify fringe concepts the conversation can ‘segue’ into.

#### Possible Question:

‘Gargle’, a search engine for people who want to find specific types of mouthwash, wants to add a new button to its main search page. How can they determine whether or not people enjoy this new button feature?

#### Possible approach:

A|B Testing would be a good way to solve this!

Key concepts we can highlight are hypothesis testing, confidence intervals, and selection bias.

First, decide on the metrics. Common ones are: Daily/Monthly Active Users, Click-Through Rate (CTR), user engagement over time, etc.
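As a quick sketch of the simplest of these metrics, CTR is just clicks divided by impressions (the function name and numbers here are purely illustrative):

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR = clicks / impressions; returns 0.0 when there are no impressions."""
    if impressions == 0:
        return 0.0
    return clicks / impressions

# e.g. 260 clicks out of 10,000 impressions
print(click_through_rate(260, 10_000))  # 0.026
```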

Next, prepare two or more versions of the site/page: a control (without feature changes), and one for each version of the button you want to test.

We will subsequently serve the different versions of the page to the population.

Point of emphasis: we have to split the population randomly! Assigning users non-randomly (e.g. by geography or sign-up date) introduces selection bias.
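One common way to do the split is to hash each user ID into a bucket, which is random-looking across users but stable for any given user, so the same person always sees the same version. A minimal sketch (the function and variant names are my own, not from any particular framework):

```python
import hashlib

def assign_variant(user_id: str, variants=("control", "treatment")) -> str:
    """Deterministically map a user ID to a variant via a hash.

    Hashing (rather than a coin flip per request) keeps each user's
    experience consistent across visits while splitting the population
    roughly evenly between variants.
    """
    digest = hashlib.md5(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same user always lands in the same bucket:
print(assign_variant("user-123") == assign_variant("user-123"))  # True
```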

Once set-up is done, we can define the statistical hypothesis test: create a null hypothesis, e.g. that the CTR will be the same for the control group as for the group using the new feature.

Then, pre-define the acceptable Type I error rate (the false positive rate, a.k.a. alpha).

Use this to determine whether there is statistically significant evidence to reject H0, i.e. whether the test's p-value falls below alpha.
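For a CTR comparison like this, one standard choice is a two-proportion z-test. A self-contained sketch, assuming hypothetical click/impression counts for the two groups:

```python
from math import erf, sqrt

def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for a difference in CTR between groups a and b.

    Returns (z, p_value), using the pooled proportion for the
    standard error under H0: CTR_a == CTR_b.
    """
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via erf).
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Illustrative counts: 2.0% CTR for control vs 2.6% for the new button.
alpha = 0.05
z, p = two_proportion_z_test(clicks_a=200, n_a=10_000, clicks_b=260, n_b=10_000)
if p < alpha:
    print("Reject H0: the CTRs differ significantly")
else:
    print("Fail to reject H0")
```

With these made-up counts the p-value comes in well under 0.05, so we would reject H0; with smaller samples or a smaller lift, the same difference can easily fail to reach significance.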