Hypothesis Testing

Sai Krishna Dammalapati
3 min read · Mar 11, 2024


Pre-requisites:

  • Central Limit Theorem (CLT): No matter what the population distribution is (as long as it has a finite variance), the distribution of sample means is approximately normal once the sample size is large enough.

The normal distribution has thin tails: extreme cases (more than 3 standard deviations from the mean) have only about a 0.3% chance of happening, well under 1%. I hope you have also read my blog on Fat Tails, a case where the CLT fails.
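To see the CLT in action, here is a minimal Python sketch. It assumes a skewed exponential population, chosen purely for illustration, draws many samples of 50, and checks whether the sample means behave like a bell curve. The function name `sample_means` and all the numbers are hypothetical.

```python
import random
import statistics


def sample_means(n_samples, sample_size, rng):
    """Draw repeated samples from a skewed (exponential) population
    and return the mean of each sample."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(n_samples=5000, sample_size=50, rng=rng)

center = statistics.fmean(means)
spread = statistics.stdev(means)

# For a bell curve, roughly 68% of values lie within 1 std of the mean
# and roughly 99.7% lie within 3 std.
within_1 = sum(abs(m - center) <= spread for m in means) / len(means)
within_3 = sum(abs(m - center) <= 3 * spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")  # population mean is 1.0
print(f"std of sample means:  {spread:.3f}")  # theory: 1 / sqrt(50)
print(f"share within 1 std:   {within_1:.3f}")
print(f"share within 3 std:   {within_3:.3f}")
```

Even though the population itself is heavily skewed, the shares within 1 and 3 standard deviations should land close to what the bell curve predicts.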

You conducted one experiment and saw that Drug A helps patients recover 15 hours faster than Drug B. This becomes your hypothesis.

Hypothesis: Drug A helps patients recover faster than Drug B.

To test the hypothesis, we should do experiments.

Say you perform the experiment 1000 times.

  1. Collect a random sample from the population.
  2. Split the sample into two groups randomly. Give Drug A to one group and Drug B to the other. Measure how fast each group recovers.
  3. Subtract the two measurements to find how much faster patients recover on Drug A.
  4. Store the result.
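The four steps above can be sketched as a small simulation. This is only an illustration under made-up assumptions: recovery times are taken to be normally distributed around 115 hours with a standard deviation of 30 hours, and Drug A is assumed to truly shorten recovery by 15 hours. The function `run_experiment` and its parameters are hypothetical.

```python
import random
import statistics


def run_experiment(rng, sample_size=200, true_effect=15.0):
    """Run one simulated trial and return how many hours faster
    the Drug A group recovered compared with the Drug B group."""
    # Step 1: collect a random sample (baseline recovery time in hours).
    baseline = [rng.gauss(115.0, 30.0) for _ in range(sample_size)]

    # Step 2: split the sample into two groups at random.
    rng.shuffle(baseline)
    half = sample_size // 2
    hours_a = [h - true_effect for h in baseline[:half]]  # assumed drug effect
    hours_b = baseline[half:]

    # Step 3: subtract the two group averages.
    return statistics.fmean(hours_b) - statistics.fmean(hours_a)


rng = random.Random(42)

# Step 4: repeat the experiment 1000 times and store every result.
results = [run_experiment(rng) for _ in range(1000)]

average_effect = statistics.fmean(results)
effect_spread = statistics.stdev(results)
share_favorable = sum(r > 0 for r in results) / len(results)

print(f"average effect: {average_effect:.1f} hours")
print(f"spread of effects: {effect_spread:.1f} hours")
print(f"share of experiments where Drug A was faster: {share_favorable:.3f}")
```

Each run gives a slightly different effect size, which is exactly the experiment-to-experiment variation a sampling distribution describes.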

As per the Central Limit Theorem, these results will be approximately normally distributed. This means the magnitude of the effect will differ from one experiment to the next: in one experiment Drug A might help patients recover 10 hours faster; in another, 20 hours faster.

Say 95% of these results support your hypothesis, i.e. Drug A helped patients recover faster. Then you fail to reject the hypothesis.

But if you don’t get favorable results (Drug A is worse in many experiments), you can happily reject the hypothesis. Drug A is not helping patients recover faster.

But you usually cannot run many experiments in real life. India's expenditure on R&D is low as a share of GDP, so you don't have enough money to conduct many experiments. Hence, you take the help of statistics. Because Indians are good at math.

We define something called the Null Hypothesis: that there is no difference between Drugs A and B.

And if 95% of experiments give me a result that contradicts it (they show a difference), I can reject the Null Hypothesis. Then, A is different from B.

But here is where statistics comes in: I need not do multiple experiments. I will perform just one experiment and see where its result falls in the sampling distribution I would expect if my Null Hypothesis were true.

If the experiment result falls close to the mean (green ball) of the sampling distribution, then I cannot reject the null hypothesis.

If the experiment result falls 3 or more standard deviations away from the mean (red ball), then as per the empirical rule of the bell curve, there is less than a 1% chance (about 0.3%) of seeing a result this extreme if the Null Hypothesis were true. The result is probably coming from some other distribution. This means I can reject the null hypothesis at the 99% confidence level: I reject the claim that there is no difference between Drugs A and B.
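To make the "how far from the mean" idea concrete, here is a rough sketch that converts one experiment's result into a number of standard deviations and the matching tail probability of the bell curve. The standard error of 5 hours and the two example results are invented for illustration, and `two_sided_tail` is a hypothetical helper name.

```python
from statistics import NormalDist


def two_sided_tail(observed, null_mean, std_error):
    """Return (z, probability of a result at least this far from
    null_mean in either direction), assuming a normal sampling
    distribution under the Null Hypothesis."""
    z = (observed - null_mean) / std_error
    tail = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, tail


null_mean = 0.0   # Null Hypothesis: no difference between Drugs A and B
std_error = 5.0   # assumed spread of the sampling distribution, in hours

# A result near the mean (green ball) and one 3 std away (red ball).
near_z, near_tail = two_sided_tail(2.0, null_mean, std_error)
far_z, far_tail = two_sided_tail(15.0, null_mean, std_error)

print(f"near result: z = {near_z:.1f}, tail probability = {near_tail:.4f}")
print(f"far result:  z = {far_z:.1f}, tail probability = {far_tail:.4f}")
```

The far result sits 3 standard deviations out and has a tail probability well below 1%, while the near result is entirely ordinary under the Null Hypothesis.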

To quantify all of the above, we use p-values to test the null hypothesis. Typically, we reject the Null Hypothesis if p < 0.05 (a 5% significance level, i.e. 95% confidence). More on it in the blogs coming next!
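As a small preview, here is one simulation-based way to get a p-value from a single experiment: a permutation test. This is a swapped-in technique for illustration, not necessarily the method the follow-up posts will use. The idea is that if the Null Hypothesis is true, the drug labels are interchangeable, so shuffling them shows what differences arise by chance alone. The data, the helper name `permutation_p_value`, and its parameters are all hypothetical.

```python
import random
import statistics


def permutation_p_value(hours_a, hours_b, n_shuffles=2000, seed=0):
    """Two-sided permutation test for a difference in mean recovery time.
    Returns (observed difference, p-value)."""
    rng = random.Random(seed)
    observed = statistics.fmean(hours_b) - statistics.fmean(hours_a)
    pooled = list(hours_a) + list(hours_b)
    n_a = len(hours_a)

    extreme = 0
    for _ in range(n_shuffles):
        # Under the Null Hypothesis the labels carry no information,
        # so reassign them at random and recompute the difference.
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[n_a:]) - statistics.fmean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add 1 to both counts so the estimate is never exactly zero.
    return observed, (extreme + 1) / (n_shuffles + 1)


# Made-up data for one experiment: recovery times in hours.
data_rng = random.Random(1)
hours_a = [data_rng.gauss(100.0, 30.0) for _ in range(300)]
hours_b = [data_rng.gauss(115.0, 30.0) for _ in range(300)]

observed, p_value = permutation_p_value(hours_a, hours_b)
alpha = 0.05  # the usual threshold mentioned above

print(f"observed difference: {observed:.1f} hours, p = {p_value:.4f}")
print("reject the Null Hypothesis" if p_value < alpha
      else "cannot reject the Null Hypothesis")
```

When both groups are identical, every shuffle is at least as extreme as the observed difference of zero, so the p-value comes out as 1.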
