Hypothesis Testing: Why is it wrong to accept the Null Hypothesis?

Pankaj Agarwal
Published in Analytics Vidhya
5 min read · Jul 30, 2021
Failure to reject a hypothesis doesn’t imply that the hypothesis is true

Introduction

In an A/B test, we use hypothesis testing to check whether a new feature actually improves the metrics of interest compared to the control group. During the analysis, if the p-value turns out to be greater than 5% (say), I sometimes conclude that we accept the null hypothesis, i.e. the new feature is no different from the old one. Is that statement wrong? After all, if we cannot reject the null hypothesis, what is the harm in saying that we accept it? It is actually a very wrong conclusion to draw from the test. In this blog, I will try to explain the reasoning behind this intuitively.

Human Cognitive Bias

Let me explain this bias using a game that I played with a friend.
Imagine I have created a rule in my mind and my friend is supposed to guess what that rule is. To start, I tell him 3 numbers which follow the rule I created. My friend can then ask me whether another set of 3 numbers follows the rule or not, and I either confirm or deny it.

Me: The three numbers which follow my rule are 2, 4 and 8. Can you guess my rule?

Friend: Okay! Does 6, 12, 24 follow your rule?

Me: Yes, it follows my rule!

Friend: Okay, so it’s simple, the rule is “x2” (each number is twice the previous one).

Me: No, that’s not the rule.

Friend: Does 10, 20, 40 follow your rule?

Me: Yes, it follows my rule! But my rule is not “x2”.

Friend: How about 1000, 2000, 4000?

Me: Yes, it follows my rule. But my rule is not “x2”.

Friend: That’s baffling :| How is this possible? I don’t know what your rule is then.

Me: Let me give you a hint. Why don’t you try to disprove the rule you have in mind rather than prove it?

Friend: Okay, let me check numbers which don’t follow the “x2” rule. Does 2, 4, 7 follow your rule?

Me: Yes, they follow.

Friend: Okay, that’s crazy! How about 5, 4, 3?

Me: Nope, that doesn’t follow my rule.

Friend: Okay, I see! Is your rule that all 3 numbers must be in ascending order?

Me: Yes!! You guessed it right.

Conclusion: A set of observations can be consistent with multiple truths, hypotheses, or theories. It is often very hard to prove a theory using empirical evidence, because doing so requires disproving every other theory that could explain the observed data, which is impractical. That is why we find it easier to reject a theory than to prove it. And if we are unable to reject the theory based on the evidence, the appropriate statement is that we fail to reject the hypothesis under trial. We cannot say that we accept this hypothesis.
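To make the game concrete, here is a minimal Python sketch. The function names `doubling` and `ascending` are my own labels for the friend's guess and the actual rule; they are only for illustration. It shows that every confirming triple satisfies both rules, so confirmations alone cannot separate them, while a triple chosen to break the guessed rule can.

```python
def doubling(triple):
    """Friend's guess: each number is twice the previous one."""
    a, b, c = triple
    return b == 2 * a and c == 2 * b


def ascending(triple):
    """The actual rule: the numbers are in ascending order."""
    a, b, c = triple
    return a < b < c


# Triples the friend tried while hoping to confirm the "x2" guess.
confirming = [(2, 4, 8), (6, 12, 24), (10, 20, 40), (1000, 2000, 4000)]

# Both rules agree on every confirming triple, so these tests
# carry no information about which rule is the real one.
for t in confirming:
    print(t, "doubling:", doubling(t), "ascending:", ascending(t))

# Triples chosen to break the guess are the informative ones.
for t in [(2, 4, 7), (5, 4, 3)]:
    print(t, "doubling:", doubling(t), "ascending:", ascending(t))
```

The triple (2, 4, 7) is where the two rules finally disagree, which is exactly the question that moved the game forward.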

Explanation using Court Verdict

If a court pronounces an accused not guilty, does that mean that the accused is innocent?

In most court proceedings, it is assumed that an accused person (say Alex) is innocent until proven guilty. The job of the prosecutor is to prove him guilty beyond a reasonable doubt. Then, on the basis of the evidence and arguments produced, the judge/jury arrives at one of these two conclusions:

  1. Guilty: Based on the evidence, the court is convinced that Alex committed the crime and hence he is declared guilty.
  2. Not Guilty: Based on the evidence, the court could not conclude that Alex committed the crime.

In the latter case, is the judge/jury sure that Alex is innocent? The answer is no. Alex may well have committed the crime, but the evidence was not sufficient to prove beyond a reasonable doubt that he is guilty. Maybe the investigation was not foolproof and thus the evidence collected was insufficient. The court can make a mistake here, but it prefers to err on the side of a False Negative (acquitting a guilty person) rather than a False Positive (convicting an innocent one).

Statistical Reasoning

Suppose I have a coin, and my goal is to statistically examine whether it is biased or unbiased. As you might have guessed, it is easier to reject the hypothesis that the coin is unbiased than to prove that it is unbiased. Hence, we run an experiment and perform a z-test with the setup below:

Null Hypothesis (H0): p = 0.5 ( p = true probability of heads)

Alternate Hypothesis (H1): p != 0.5

Experiment: We tossed the coin 5 times and we got 4 heads.

We can perform a z-test, which assumes that the null hypothesis is true.

z = (observed proportion of heads - probability of heads under H0) / standard_error

z = (0.8 - 0.5) / sqrt(0.5 * (1 - 0.5) / 5) ~ 1.34

From the normal distribution tables, we get:
p-value ~ 18% (two-sided; since this is > 5%, we conclude that we fail to reject the null hypothesis.)
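The same arithmetic can be sketched in a few lines of Python using only the standard library. The function name `proportion_ztest` is my own, not a library call, and the two-sided p-value is computed from the normal approximation via `math.erfc`. With only 5 tosses the normal approximation is rough, so treat this as an illustration of the formula rather than a recommended procedure.

```python
import math


def proportion_ztest(heads, n, p0):
    """Return (z, two-sided p-value) for H0: p = p0 using the normal approximation."""
    p_hat = heads / n                        # observed proportion of heads
    se = math.sqrt(p0 * (1 - p0) / n)        # standard error assuming H0 is true
    z = (p_hat - p0) / se
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z, p_value = proportion_ztest(heads=4, n=5, p0=0.5)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")  # z = 1.34, p-value = 0.18
```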

What if I were to say instead that I accept the null hypothesis? What is wrong with that?

To illustrate the problem, let’s repeat the above z-test with a twist. This time, we will use a different pair of null and alternate hypotheses.

Null Hypothesis (H0): p = 0.55 ( p = true probability of heads)

Alternate Hypothesis (H1): p != 0.55

Again, on the basis of the same results, we can compute the z-value and the p-value.

z = (0.8 - 0.55) / sqrt(0.55 * (1 - 0.55) / 5) ~ 1.12

p-value ~ 26% (two-sided, as before)

Once again, I could say that I accept the null hypothesis since the p-value is > 5%.

One can see the problem with acceptance now. For the same coin, we are claiming that two hypotheses (p = 0.5 and p = 0.55) are both correct, which is clearly absurd. The coin can have only one true probability of heads. Similarly, there are several other hypotheses that would seem plausible given the above results.

Instead, if we restrict ourselves to concluding that we fail to reject the null hypothesis in both tests, the two results no longer contradict each other.
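To see how many null values survive the same 4-heads-in-5-tosses result, here is a small sketch that repeats the z-test over a grid of candidate values. The grid in `candidates` and the helper name `two_sided_p` are my own choices for illustration, and the test is the same normal-approximation z-test used above.

```python
import math


def two_sided_p(heads, n, p0):
    """Two-sided p-value for H0: p = p0 using the normal approximation."""
    z = (heads / n - p0) / math.sqrt(p0 * (1 - p0) / n)
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical grid of null values to test against the same data.
candidates = [0.3, 0.4, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9]

# Keep every null value that we fail to reject at the 5% level.
not_rejected = [p0 for p0 in candidates if two_sided_p(4, 5, p0) > 0.05]
print(not_rejected)
```

Under these assumptions, only p = 0.3 is rejected from this grid. "Accepting" every surviving value would mean claiming that the coin has seven different probabilities of heads at once.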

Conclusion

There can be various reasons for getting a p-value greater than 5%. Some of the common ones are:

a. The sample size, and hence the power of the test, was low, so we could not detect the change. Increasing the sample size may bring the p-value below 5%.

b. The null hypothesis is actually true. In that case, increasing the sample size will not systematically bring the p-value below 5%.

c. The observed data can be explained by many hypotheses, including the null hypothesis. Thus, claiming that the null hypothesis is true would be incorrect.
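Reason (a) can be illustrated with the coin example. The sketch below holds the observed proportion of heads fixed at 80% and only increases the number of tosses. That fixed proportion is an assumption for illustration; in a real experiment the observed proportion would change as more data comes in. The helper name `two_sided_p` is my own.

```python
import math


def two_sided_p(heads, n, p0=0.5):
    """Two-sided p-value for H0: p = p0 using the normal approximation."""
    z = (heads / n - p0) / math.sqrt(p0 * (1 - p0) / n)
    return math.erfc(abs(z) / math.sqrt(2))


# Same 80% heads each time, only the sample size changes (hypothetical data).
experiments = [(4, 5), (8, 10), (16, 20), (40, 50)]

for heads, n in experiments:
    print(f"n = {n:2d}, heads = {heads:2d}, p-value = {two_sided_p(heads, n):.4f}")
```

With 5 or 10 tosses the p-value stays above 5%, while from 20 tosses onward it falls below 5%. The same observed effect goes from "fail to reject" to "reject" purely because the test gained power, which is why a large p-value from a small sample says little about the null hypothesis being true.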

I hope that with these three different kinds of explanation, I was able to convey the point that “failing to reject a hypothesis is not the same as accepting it”.

References

  1. The human cognitive bias example was shamelessly copied from a video by the Veritasium channel: https://www.youtube.com/watch?v=vKA4w2O61Xo&ab_channel=Veritasium
  2. https://medium.com/analytics-vidhya/binary-classification-vs-hypothesis-testing-explained-using-real-life-covid-19-use-cases-a017a728650d
