Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results [Top 100 journal articles of 2018]

This article is part 7 of a series reviewing selected papers from Altmetric’s list of the top 100 most-discussed journal articles of 2018.

Data analysis in research can be seen as mechanical and unimaginative, so it’s easy to overlook that results may depend on the analytic strategy chosen. While researchers may be aware of this in principle, there is little appreciation of what it implies in practice.

But what if scientific results are highly contingent on subjective decisions made at the analysis stage? The consequences could include results that are fraught with unrecognized uncertainty, research findings that are less trustworthy than they at first appear to be, or even an entirely different result.

To address the current lack of knowledge about the implications of analytic decisions, the authors of an August 2018 paper[1] examined the analytic decisions of 29 teams that analyzed the same data set to answer the same research question, and how those decisions affected the results.

Analytic choices varied widely across the teams, and so did their results. Twenty teams found a statistically significant positive effect, while nine teams did not observe a significant relationship. These findings show that researchers can differ in their analytic approaches, and that results can differ according to those choices. The authors caution that:

The observed results from analyzing a complex data set can be highly contingent on justifiable, but subjective, analytic decisions. Uncertainty in interpreting research results is therefore not just a function of statistical power or the use of questionable research practices; it is also a function of the many reasonable decisions that researchers must make in order to conduct the research.

The authors conclude with this advice:

The best defense against subjectivity in science is to expose it. Transparency in data, methods, and process gives the rest of the community opportunity to see the decisions, question them, offer alternatives, and test these alternatives in further research.

They’re clearly practicing what they preach, with the paper receiving Association for Psychological Science badges for Open Data and Open Materials.

Author abstract

Twenty-nine teams involving 61 analysts used the same data set to address the same research question: whether soccer referees are more likely to give red cards to dark-skin-toned players than to light-skin-toned players. Analytic approaches varied widely across the teams, and the estimated effect sizes ranged from 0.89 to 2.93 (Mdn = 1.31) in odds-ratio units. Twenty teams (69%) found a statistically significant positive effect, and 9 teams (31%) did not observe a significant relationship. Overall, the 29 different analyses used 21 unique combinations of covariates. Neither analysts’ prior beliefs about the effect of interest nor their level of expertise readily explained the variation in the outcomes of the analyses. Peer ratings of the quality of the analyses also did not account for the variability. These findings suggest that significant variation in the results of analyses of complex data may be difficult to avoid, even by experts with honest intentions. Crowdsourcing data analysis, a strategy in which numerous research teams are recruited to simultaneously investigate the same research question, makes transparent how defensible, yet subjective, analytic choices influence research results.
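To make concrete how a covariate choice can shift an estimate expressed in odds-ratio units, here is a minimal sketch in Python. It is not the authors' analysis and does not use their data: the counts, the stratifying covariate, and the function names are all hypothetical, chosen only for illustration. It compares a crude odds ratio computed from pooled counts with a Mantel-Haenszel odds ratio that adjusts for one stratifying covariate.

```python
# Illustrative sketch only: hypothetical counts, NOT data from the paper.
# Each stratum is a 2x2 table given as (a, b, c, d):
#   a = exposed group, event occurred      b = exposed group, no event
#   c = unexposed group, event occurred    d = unexposed group, no event
HYPOTHETICAL_STRATA = [
    (30, 970, 10, 490),  # hypothetical covariate level 1
    (10, 490, 20, 980),  # hypothetical covariate level 2
]


def crude_odds_ratio(strata):
    """Odds ratio after pooling all strata, i.e. ignoring the covariate."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel odds ratio, adjusting for the stratifying covariate."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


if __name__ == "__main__":
    print(f"crude OR:    {crude_odds_ratio(HYPOTHETICAL_STRATA):.2f}")
    print(f"adjusted OR: {mantel_haenszel_odds_ratio(HYPOTHETICAL_STRATA):.2f}")
```

With these made-up counts, the two estimates differ even though both come from the same data and both approaches are defensible. This is a much simpler version of the situation the abstract describes, in which the 29 analyses used 21 unique combinations of covariates.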

Header image source: 462634 on Pixabay, Public Domain.


  1. Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F., Awtrey, E., … & Carlsson, R. (2018). Many analysts, one data set: Making transparent how variations in analytic choices affect results. Advances in Methods and Practices in Psychological Science, 1(3), 337–356.

Originally published at RealKM.