Why I don’t trust scientific research
And why p-hacking is tainting many academic fields

P-hacking has made me suspicious of all scientific research. No, I’m not a global warming denialist, conspiracist, lunatic, or Trump supporter. And yes, I believe that science has played, and will continue to play, a critical part in the advancement of humanity. But p-hacking, the manipulation of data and analysis until a study’s p-value looks good, has tainted many research papers and academic fields. It’s vital that we question not only the methods of scientific research but also the approaches used to process the data drawn from studies. This article explains why.
The validity of many research papers depends on a good p-value. But this measure can be, and often is, manipulated by scientists to achieve the results they want.
What is a p-value?
P stands for probability: a p-value denotes the probability of obtaining a result at least as extreme as the one observed, assuming there is no real effect. Because a p-value can be calculated for many different types of experiments, it has become a standard marker of robustness in research. A p-value of less than 0.05 is widely agreed to be the benchmark for a fruitful hypothesis, meaning there is a less than 5 percent chance of seeing such a result purely by chance. The issue is that the value of 0.05 becomes a target for scientists to reach; something that I have seen first-hand.
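To make that definition concrete, here is a minimal sketch (my own illustration, not drawn from any particular statistics package) that estimates a p-value by simulation: it asks how often pure chance produces a coin-flip result at least as lopsided as the one observed.

```python
import random

random.seed(0)

def p_value(observed_heads, n=100, sims=2000):
    """Two-sided Monte Carlo p-value: how often does a fair coin,
    flipped n times, land at least as far from the expected 50/50
    split as the observed result did?"""
    expected = n / 2
    extreme = 0
    for _ in range(sims):
        heads = sum(random.random() < 0.5 for _ in range(n))
        if abs(heads - expected) >= abs(observed_heads - expected):
            extreme += 1
    return extreme / sims

# 65 heads out of 100 is far from what chance alone tends to
# produce, so its p-value falls well below 0.05; 52 heads is
# unremarkable, so its p-value is large.
print(p_value(65))
print(p_value(52))
```

The same logic underlies the p-values that regression software reports: small means "chance alone rarely produces this", large means "nothing to see here".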
What is p-hacking?
I studied economics at King’s College London, and during my second year, I took a course in econometrics. During one seminar, the convener handed the class ten questions to complete with the use of data from various sites — including the United Nations — and a statistical analysis programme called Stata.
One question required me to regress a country’s corruption perceptions index against economic growth, which did, as expected, produce a negative correlation. But the p-value was high, meaning that the result wasn’t statistically significant. I asked the convener what I had done wrong. She said that I had done nothing wrong; all I had to do was include other variables in the regression until a p-value of less than 0.05 was produced.
By including specific variables and excluding others, I was able to push the p-value below 0.05. This is p-hacking: throwing variables at the data wall and seeing what sticks. And this wasn’t a one-off. Because the p-value is entrenched in the field of economics, every question I answered over the course had to hit this 0.05 threshold. I explained my concerns about this method to my university tutor.
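The wall-throwing procedure is easy to simulate. The sketch below is a hypothetical illustration (the function names and the permutation test are my own choices, not anything Stata does): it keeps generating pure-noise "regressors" and testing them against an equally random "growth" series until one of them happens to clear the 0.05 bar.

```python
import random

random.seed(42)

def corr(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def perm_p_value(x, y, sims=200):
    """Permutation p-value: how often does shuffling y produce a
    correlation at least as strong as the observed one?"""
    observed = abs(corr(x, y))
    ys = list(y)
    hits = 0
    for _ in range(sims):
        random.shuffle(ys)
        if abs(corr(x, ys)) >= observed:
            hits += 1
    return hits / sims

def p_hack(y, max_tries=100):
    """Data-dredge: generate random noise 'regressors' until one
    correlates with y at p < 0.05, and report which try won."""
    for attempt in range(1, max_tries + 1):
        x = [random.gauss(0, 1) for _ in range(len(y))]
        if perm_p_value(x, y) < 0.05:
            return attempt
    return None

# 'growth' here is pure noise: there is nothing real to find.
growth = [random.gauss(0, 1) for _ in range(30)]
print(f"'Significant' noise variable found on try {p_hack(growth)}")
```

With a 5 percent false-positive rate per attempt, roughly one in twenty noise variables will look significant, so the search usually succeeds within a couple of dozen tries. That is exactly why adding variables until the p-value cooperates is so easy.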
She told me that despite everyone knowing that the p-value was flawed, no one dared question it, because that would mean throwing away years of ‘good’ research. This is crazy. Academics are adding and culling variables to create the desired outcome, despite knowing that the practice is flawed.
P-hacking isn’t limited to economics; a 2015 analysis of 100,000 open-access papers, published in PLOS Biology, found evidence of p-value manipulation across many disciplines. A decade earlier, the epidemiologist John P. A. Ioannidis had published a famous essay in PLOS Medicine attacking statistical methods, entitled ‘Why most published research findings are false’.
There is a fundamental problem at the base of much research that is tainting entire fields of study. Its cause is hard to pin down, but its effect is profound. Always check the data-processing techniques when reading a research paper.
