Fixing the Problem of P-Values in Scientific Research
Scientists and researchers have increasingly come to rely on p-values, a measure of statistical significance. A p-value below 0.05 has become a de facto requirement for publication in academic journals, which gives researchers an incentive to produce studies that yield such results.
This trend was evident in a study recently published in The Journal of the American Medical Association by a team of researchers including John P.A. Ioannidis, MD, DSc, professor of disease prevention and of health research and policy and co-director of the Meta-Research Innovation Center at Stanford, and lead author David Chavalarias, PhD, director of the Complex Systems Institute in France. According to their research, 96 percent of the more than 1.6 million biomedical research papers studied, published from 1990 to 2015, reported a statistically significant p-value of 0.05 or lower.
Why is this a problem? The p-value has serious limitations; it is not an indicator of how likely a result is to be true or false. It is not a replacement for scientific reasoning.
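To make this limitation concrete, here is a minimal sketch of one common way to reason about it. It is not taken from the studies cited here. It assumes a hypothetical field in which a fraction `prior` of tested hypotheses are actually true, tests detect a true effect with probability `power`, and the significance threshold is `alpha`; all three names and the numbers used are illustrative assumptions.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    """Share of 'significant' results that reflect a true effect.

    prior: assumed fraction of tested hypotheses that are true (hypothetical)
    power: assumed probability of detecting a true effect (hypothetical)
    alpha: significance threshold, i.e. the false-positive rate under the null
    """
    true_positives = power * prior          # true effects that reach significance
    false_positives = alpha * (1 - prior)   # null effects that reach significance
    return true_positives / (true_positives + false_positives)


# Illustrative numbers only: 1 in 10 tested hypotheses is true, 80% power.
ppv_low_prior = positive_predictive_value(prior=0.10, power=0.80)
# Same test, but half of the tested hypotheses are true.
ppv_high_prior = positive_predictive_value(prior=0.50, power=0.80)

print(f"PPV with prior 0.10: {ppv_low_prior:.2f}")
print(f"PPV with prior 0.50: {ppv_high_prior:.2f}")
```

Under these assumed inputs, the same 0.05 threshold leads to very different shares of true findings, which is why a p-value on its own cannot say how likely a result is to be true.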
Research is being published and touted as “statistically significant” even though it may not hold up when replicated, which lowers the quality of scientific findings. In a study published in Science, researchers attempted to replicate 100 studies that were published in 2008 in three leading psychology journals. Although 97 percent of the original studies reported a p-value of 0.05 or less, only 36 percent of the replications produced statistically significant results.
For the first time in its history, on March 7, 2016, the American Statistical Association (ASA) released a “Statement on Statistical Significance and P-Values” in an effort to “improve the conduct and interpretation of quantitative science and inform the growing emphasis on reproducibility of science research.” The statement’s six principles are:
- P-values can indicate how incompatible the data are with a specified statistical model.
- P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
- Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
- Proper inference requires full reporting and transparency.
- A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
- By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
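The fifth principle, that a p-value does not measure the size of an effect, can be illustrated with a small calculation. The sketch below assumes a simple two-sided z-test for a difference in means between two equal-sized groups with a known common standard deviation; the function names, effect sizes, and sample sizes are hypothetical and chosen only for illustration.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_for_mean_difference(diff, n_per_group, sd=1.0):
    """z statistic for a difference in means between two equal-sized groups,
    assuming a known common standard deviation (a simplifying assumption)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    return diff / standard_error


# A tiny effect (0.02 standard deviations) measured in a very large study.
p_tiny_effect = two_sided_p_value(z_for_mean_difference(0.02, n_per_group=100_000))

# A much larger effect (0.5 standard deviations) measured in a small study.
p_large_effect = two_sided_p_value(z_for_mean_difference(0.5, n_per_group=20))

print(f"tiny effect, large sample:  p = {p_tiny_effect:.2g}")
print(f"large effect, small sample: p = {p_large_effect:.2g}")
```

With these assumed numbers, the negligible effect clears the 0.05 threshold easily while the much larger effect does not, because the p-value depends on sample size as well as on the size of the effect.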
The new guidelines are intended to provide greater transparency, increase the reproducibility of study findings, and improve scientific rigor across all areas of scientific research and development. They raise the standard for what will be published in academic journals going forward.
- JAMA, Vol. 315, No. 11, March 15, 2016.
- Science, Vol. 349, Issue 6251, August 28, 2015.