P-value interpretation

Little Dino
3 min read · Mar 9, 2022


Introduction

For any statistical test, we have a hypothesis to test. Often, researchers use a p-value to decide whether to reject a null hypothesis (H0). In this article, we’ll discuss how to use a p-value to make a judgement call.

However, you should first bear in mind that a p-value alone is NOT enough! Never rely solely on a p-value to draw a conclusion.

What is a p-value?

A p-value is the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two groups) would be equal to or more extreme than its observed value (ASA statement).

It’s quite confusing, so think about it this way. The specified statistical model is the null hypothesis, H0. Imagine hypothetical data generated in a world where H0 is true, and compute the same statistical summary on them. Assume the statistical summary is the mean difference between two groups.

So when the probability that this hypothetical mean difference is equal to or more extreme than the mean difference in our observed data is high, we do NOT reject H0. Remember that the hypothetical data are based on H0, the hypothesis that there is NO difference between the 2 groups. When data generated under H0 often produce a mean difference as large as or larger than the one we observed, our observed difference is nothing unusual under H0, so we CANNOT reject H0.

If it’s still confusing, a p-value essentially indicates how incompatible your data are with a specified statistical model (e.g., H0). The smaller the p-value, the greater the incompatibility.
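
To make the “hypothetical data where H0 is true” idea concrete, here is a minimal sketch that estimates a p-value by simulation. It uses a permutation test, which is my choice for illustration and not something the ASA statement prescribes: if H0 is true, the group labels carry no information, so shuffling them produces hypothetical data consistent with H0. The function name `permutation_p_value`, its parameters, and the measurements are all made up for this example.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the labels are interchangeable, so a shuffled split is one
        # draw of "hypothetical data where H0 is true".
        rng.shuffle(pooled)
        simulated = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Count hypothetical differences equal to or more extreme than ours.
        if simulated >= observed:
            extreme += 1
    # Adding 1 to both counts keeps the estimate from being exactly 0.
    return (extreme + 1) / (n_permutations + 1)


# Made-up measurements, for illustration only.
baseline = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4]
clearly_different = permutation_p_value(baseline, [6.0, 6.2, 5.9, 6.1, 6.3, 5.8])
similar = permutation_p_value(baseline, [5.2, 5.0, 5.4, 5.1, 4.9, 5.5])

print(f"clearly different groups: p = {clearly_different:.4f}")
print(f"similar groups:           p = {similar:.4f}")
```

The first comparison should give a p-value close to 0 (shuffled labels almost never reproduce such a large gap), while the second should give a large one (shuffled labels easily produce a gap that small).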

Now you get the idea. If the p-value is small (close to 0), we reject H0; otherwise, we do not reject H0.

⚡ Rejecting H0 just means the evidence we have (our data) is against H0. Nevertheless, the truth could be that there is bias in our data, or that we just don’t have enough data. Thus, we CAN’T prove H0 wrong; we just reject it from a statistical point of view.

Draw a conclusion

We reject H0 when the p-value is small, but when is the p-value small enough? Well, that depends. A popular threshold (or significance level) is 0.05. Namely, if the p-value is less than 0.05, you reject H0.
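
The threshold rule can be written down in a few lines. This is only a sketch: the function name `conclude` and the parameter name `alpha` are my own, and 0.05 is just the popular default mentioned above, not a universal constant.

```python
def conclude(p_value, alpha=0.05):
    """Apply the threshold rule: reject H0 only when p_value is below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    # Note the wording: we never "prove" or "accept" H0 here.
    return "do not reject H0"


for p in (0.003, 0.049, 0.05, 0.2):
    print(f"p = {p}: {conclude(p)}")

# A stricter significance level changes the call for the same p-value.
print(conclude(0.03, alpha=0.01))
```

Keeping `alpha` as a parameter is a reminder that the same p-value can lead to different decisions depending on the significance level you chose before looking at the data.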

And be careful: the conclusion you draw when rejecting H0 should be “There is evidence in our data indicating the relationship/difference we observed is statistically significant, so we reject H0”. Similarly, the conclusion for not rejecting H0 is “There is not enough evidence in our data to indicate the relationship/difference we observed is statistically significant, so we do not reject H0”.

⚡ When we do not reject H0, we DON’T prove it correct. We just don’t have enough evidence to reject it. I am repeating myself, I know, but this concept is so important for interpretation.

What then?

If the p-value is so high that you can’t reject H0, then you should either collect more data or propose a new hypothesis. It’s meaningless to do further analysis on this dataset (which doesn’t contain enough evidence).

On the other hand, if the p-value is so low that you reject H0, then voilà, the research ends.

No, I am joking. We haven’t proved anything! We reject H0 because the evidence suggests the relationship/difference we observed is statistically significant, so we do FURTHER analysis.

We want to know the degree of the relationship/difference. We want to introduce more independent variables and investigate causation. We have a whole lot of things to do after rejecting H0! The p-value is the start, not the end, of our research.
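
As one example of measuring “the degree of the difference”, here is a small sketch that computes a standardized effect size (Cohen’s d) for two groups. Cohen’s d is my pick for illustration; other effect-size measures may suit your data better. The function name `cohens_d` and the numbers are made up.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up measurements, for illustration only.
treatment = [6.0, 6.2, 5.9, 6.1, 6.3, 5.8]
control = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4]
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")
```

A p-value tells you whether the data are incompatible with H0; a number like this tells you how large the difference is, measured in units of standard deviation.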

References

  1. https://absolutelymaybe.plos.org/2016/04/25/5-tips-for-avoiding-p-value-potholes/
  2. https://www.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vx2EN5MrI_U
