Invest in open science now; why looking back at work that could have been better sucks

Sam Parsons
Nov 16, 2018

TL;DR: Looking back at work you did before learning about open science is painful. But if you have the foresight to ask “should I learn about open science before running this study?” and run it the old way anyway, the “could” in the title changes to a “should”, and looking back is going to be far more painful.

Shout out to Melanie Imming for the awesome sticker!

Open and reproducible science is just science done right. But many early career researchers are worried that they will be harmed in some way by doing open research. I’ll use open science as an umbrella term here for engaging in open and reproducible research practices and reforms: preregistration, open code, self-archiving papers for open access, registered reports, and open and well-annotated data. You might add learning new skills and improving your statistical understanding to this list. In almost any talk on open science the same fear arises in one form or another. A few examples:

  • If I take the time to learn open science skills or extensively document my data/code to be shared, will I publish less?
  • If I am open but made a mistake, then will I have to retract my paper — again, hurting my career prospects?

The list goes on. Ultimately, these fears revolve around the idea that early career researchers (ECRs; here I mean anyone pre-tenure track, purely because we are the most vulnerable) might be ‘shooting themselves in the foot’ by doing what amounts to spending time doing better work and learning useful skills. One of the barriers to pursuing open science and learning new skills is the ingrained idea that this would be at the expense of marketable things like the number of papers they have published, the length of their publication list, and, oh yes, their impact factor/H-index/<insert other almost entirely useless metric>.

Will running my next study as a registered report take longer and, as a result, put me at a career disadvantage?

— paraphrased from many ECRs concerned about their career prospects

This question, and many others, highlight that concerned ECRs are thinking hard about the pros and cons of changing or adopting a different research practice. On one hand this is a good thing; ECRs are engaging with the idea of using open science practices and are thinking critically about what they can and cannot achieve in their current time-frame. On the other hand, I’m not convinced that the typical, well-meaning, response is entirely helpful. Open science advocates usually try to say ‘yes you will lose time, but it will get easier and eventually you will save time by doing these things, and your science will be better, etc’.

I think that these responses focus too heavily on the negatives. More importantly, these fears and responses miss a separate gut-wrenching realisation altogether: looking back at one’s work and realising that it could have been much better.

Looking back at work that could have been better is not fun

I started to (slowly) read about reproducibility, open science, and the associated reforms around two years ago. Since then, my appreciation for open science and desire to engage in it has increased ten-fold. Unfortunately, I learned all of this at a time when I could do little to act on it. Most of my DPhil research had been run and analysed. Yes, I preregistered the analysis plan for a replication and extension. But, using what I know now to critically assess the original study, I would not have attempted to replicate it at all. I did what I could to engage with open science and be critical of my work. But you can’t really save a study that has already been run.

As a result of engaging with open science, I am convinced that three of my DPhil studies (including data collected from a few hundred participants) will make little, if any, contribution to my field (even if I am able to publish them). I am sure that I am not alone in feeling unhappy with some of my previous work. But this realisation was more than personal growth (we all wish our previous selves knew what we know now); it was building an understanding of the discrepancy between how science is typically conducted and how it should be conducted to maximise rigour.

The only works I am truly proud of that arose tangentially to my DPhil are a methods paper and an R package I developed. These briefly appeared in a short appendix in my thesis. Yes, some of the chapters are published/under review, and thankfully these are aspects I am happier with (though these two empirical studies were largely driven by others and I mainly just ran the analyses). But that does not negate that three studies’ worth of time, resources, and data (approx n = 200–300) feel wasted to me now.

Using time to invest in open and reproducible science practices for future studies is a far cry from losing time and resources because your past studies are not robust. Especially if you already know enough to ask the question beforehand.

Being able to pose questions like “should I approach my next study as a registered report?” or “should I take the time to make my data and code open?” is a privilege, and one I fear is being overlooked due to the fears mentioned earlier. I beg ECRs who know about these best practices to invest in them. Pay the time forward. If you have the understanding and foresight to ask these questions, you know that they will yield higher quality research. By doing so, you avoid looking back in anger at sub-optimal work that, with more investment up-front, could have been substantially more robust.

Insert “funny” joke about Oasis; maybe, “start a [credibility] revolution from my bed”?

I wish I had known about open science when I started my DPhil. All that time folk are worried about ‘losing’ to open science? I wish I had invested it years ago.

If you know enough to worry about a time trade-off for open science, your future self will know more and will wish you had invested earlier.

I’m happy to be proved wrong and I certainly don’t want to project my experiences onto the experiences of others. This might be merely a reflection of my sincere belief that open and reproducible science is a necessity. But, my experience has led me to firmly believe that we should all strive to invest a few months to learn skills and produce research of greater value (reproducible, open, improved statistical analyses and inference). Invest that time now and produce work that is valuable to the research community. Don’t do what I did and have three chapters of your thesis represent little more than regret.

This is why I think all students should be taught about open and reproducible science practices from the start. This is one reason why I get so god damn pissed off when students have not heard of registered reports and open code. Remember, it’s not their fault; it’s an omission in teaching. If I had known this entering my DPhil, the whole thing would have been something worthwhile, rather than something that I am pretty much ashamed of. If you are in a position to ask “should I invest time in open science?” before running your research, you can avoid feeling as I do looking back. Hindsight is a bitch, my friend.

If I had known about open science before I ran those studies, and ran them that way anyway, I’d be truly ashamed of myself.

We need a stronger, positive message about investing time in open science

We say things like “yes you might publish less”, or “it will take longer at first, but eventually you will save time” to PhD students worried about needing to publish as much as possible to get a job. But let’s not assume that all studies are published; that’s fucking madness. Say, for instance, that one PhD student runs a registered report and two preregistered studies (all open everything), while another runs six smaller, closed-practice studies. By the end of a 3-year PhD, which student will have more published papers? Who will be more competitive on the academic job market? Chances are the number of publications will be roughly equivalent, but as the open student you will a) be prouder of and more confident in your papers, b) not have a file drawer of maybe useful, but lower quality and non-transparent, papers, and c) have demonstrated a capability to conduct highly rigorous work with a wider skill-set than most.

Which researcher would you hire? Image borrowed from @hayleyjach, talk by @siminevazire

We need a better narrative, something like “you want everything that you try to publish to be the highest quality, invest in better research and develop your skills”.

It is far better to invest time up-front than to feel like you have wasted time looking back. I hope my story explains why I think this is important. I don’t think we talk about this enough.

Damn, this has turned into a minor revelation about my feelings on the research I spent the last 3–4 years working on.


Just a few resources that might be useful/interesting.

No, it’s not The Incentives — it’s you — a blog post from Tal Yarkoni

www.osf.io

Registered reports — www.cos.io/rr

Open Science MOOC — https://opensciencemooc.eu/

A Framework for Open and Reproducible Research Training — https://forrt.netlify.com/

Written by Sam Parsons: DPhil (PhD) in @OCEANoxford @OxExpPsy | Blog: http://medium.com/@Sam_D_Parsons | Podcast: @ReproducibiliT | Open and reproducible science enthusiast
