Science is Changing, Can I Change Too?

Science is changing. Change is hard, though, and I have struggled to reconcile my own self-interested pursuits (e.g., being right, tenure) with what I see as crucial changes in methodological and statistical approaches across science. This transition can be a hard one for those of us with old habits: Even though I'm convinced by the many eloquent and logical arguments for things like pre-registration, direct replication, open science, and an end to questionable research practices, the truth is that admitting we did things wrong before means that I did things wrong before; that I might be here because I did things that I now see as objective errors. That's a hard pill to swallow.

These hard realizations require a response, and in the past three years I have tried to transform my own approach to science. This includes a shift in my lab toward more direct replications; open data and materials for manuscripts; designs with higher statistical power, including larger samples and within-subjects designs; avoidance of questionable research practices (e.g., optional stopping, selective reporting, post hoc outlier exclusion); pre-registration; and a different approach to manuscript review, one that weighs the strength of the evidence in addition to theory and novelty.

Have I been successful in changing the ways in which I conduct science? Let’s take a look at the evidence.

Like some of my more courageous colleagues, I conducted an audit of all of my published individual studies, starting in January of 2009. I divided my published record in half based on my own awareness of questionable research practices and open science, which began somewhere around late 2011 to early 2012. The studies published between 2009 and 2012 represent my pre-awareness period, and the ones published between 2013 and the present come from my post-awareness period. To evaluate the quality of my methodological practices in these two periods, I used the most relevant hypothesis from each study to identify a focal statistical test, recorded the test statistic for that focal test in a spreadsheet here, and then entered the focal statistics into the p-checker app. The results are summarized below:

[Table: M. W. Kraus (2009-2016), Audit of Research Practices]

What you can see from these data is an improvement across all measures of publication bias between the pre- and post-awareness time periods. Aside from the p-curve, my pre-awareness R-index and Test of Insufficient Variance (TIVA) are consistent with an unhealthy amount of publication bias: the first (R-index) estimates the likelihood of successful replication, whereas the second (TIVA) tests whether the variance between study results is suspiciously low, as it tends to be when publication bias is at work. In contrast, the post-awareness period shows improvements across all three metrics of publication bias (though the TIVA variance estimate remains below 1). I'm bummed about where I've been, but proud of where I'm going.
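For readers curious about what these two metrics actually compute, here is a minimal Python sketch, assuming nothing more than a list of two-sided focal p-values. The formulas follow the published definitions of TIVA and the R-index rather than p-checker's internal code, and the demo p-values are placeholders, not numbers from my audit.

```python
# Minimal sketch of TIVA and the R-index from two-sided focal p-values.
# Assumption: formulas follow the published definitions of these metrics;
# this is NOT p-checker's code, and demo_p below is placeholder data.
import numpy as np
from scipy import stats

def z_from_p(p):
    """Convert two-sided p-values to absolute z-scores."""
    return stats.norm.isf(np.asarray(p) / 2.0)

def tiva(p_values):
    """Test of Insufficient Variance: z-score variance should be ~1 in an
    unbiased literature; variance < 1 (small left-tail p) hints at selection."""
    z = z_from_p(p_values)
    k = len(z)
    var_z = np.var(z, ddof=1)
    p_left = stats.chi2.cdf((k - 1) * var_z, df=k - 1)  # left-tail chi-square test
    return var_z, p_left

def r_index(p_values, alpha=0.05):
    """R-index: median observed power minus the inflation rate
    (success rate minus median observed power)."""
    z = z_from_p(p_values)
    crit = stats.norm.isf(alpha / 2)
    obs_power = stats.norm.sf(crit - z)  # one-tail approximation of observed power
    median_power = float(np.median(obs_power))
    success_rate = float(np.mean(np.asarray(p_values) < alpha))
    return median_power - (success_rate - median_power)

demo_p = [0.04, 0.03, 0.049, 0.02, 0.045]  # placeholder p-values only
var_z, p_left = tiva(demo_p)
print(f"TIVA variance = {var_z:.2f}, left-tail p = {p_left:.3f}")
print(f"R-index = {r_index(demo_p):.2f}")
```

A cluster of just-significant p-values like the demo set produces a variance well below 1 and a low R-index, which is exactly the pattern these metrics are built to flag.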

You can also see some changing design choices reflected in my post-awareness period. The median sample size has increased from N = 109 to N = 178, and I hope to lift my minimum sample size to N = 200 for the rest of my career. Implementing larger samples can mean moving online for more of your data collection, and you can see that happening in my own practices: whereas fewer than 20% of my published studies came from online crowdsourcing communities like MTurk in the pre-awareness period, that number is closer to 42% now. In addition to these online data collection practices, I have increased my lab/field N by using more creative data collection strategies in the field (see here) and by searching for large pre-existing data sources (e.g., here). Besides increasing N and implementing more within-subjects designs, I have also looked for ways to elicit larger effects with more powerful manipulations and more extreme samples.
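To make that design logic concrete, here is a back-of-the-envelope power calculation using statsmodels. The effect size of d = 0.40 is an illustrative assumption, not an estimate from any of my own studies.

```python
# Rough a priori power calculation; d = 0.40 is an assumed effect size
# chosen for illustration, not an estimate from any particular study.
from statsmodels.stats.power import TTestIndPower, TTestPower

# Between-subjects two-group design: participants needed per group
n_per_group = TTestIndPower().solve_power(effect_size=0.40, alpha=0.05, power=0.80)
print(f"between-subjects: ~{2 * n_per_group:.0f} participants in total")  # roughly 200

# Within-subjects (paired) design: each person serves as their own control
n_paired = TTestPower().solve_power(effect_size=0.40, alpha=0.05, power=0.80)
print(f"within-subjects: ~{n_paired:.0f} participants in total")  # roughly 50
```

The same assumed effect that requires roughly 200 participants between subjects needs only about 50 when measured within subjects, which is the arithmetic behind both the N = 200 floor and the shift toward within-subjects designs.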

I also started to adopt some new scientific practices that I hope will improve the robustness of my own research: Whereas my pre-awareness period includes zero (!!!) direct replications of focal results from my own studies, I have started incorporating these more frequently into new study designs. My goal is to devote one project per year exclusively to this direct replication initiative. I have also started the process of making data and materials available for my published studies, and will continue to expand these open practices. I've even published two studies with pre-registered data collection and analyses. I'm also actively looking for additional tools and strategies to improve the reproducibility of my research.

One consequence of these changing practices is that my science has slowed down a little bit, with fewer articles published in the post-awareness period than in the pre-awareness period. Higher quality methods require more time and effort per study. This is a trade-off that I think everyone can become more comfortable with over time, especially as it should come with more reliable results. Qualitatively, I also find that each of my newer papers is a little more narrowly focused on a particular hypothesis, a feature consistent with having to implement more direct replications of prior research.

“Openness is not needed because we are untrustworthy; it is needed because we are human.” — Brian Nosek, Jeffrey Spies, and Matt Motyl (here)

Science is a big thing, but changing it relies on simple decisions made by individual researchers. I wrote this as an admission that I need to change, and as a call to action for others. I hope you will answer the call in your own labs and in the pages of our journals.