Making research open and reproducible: An early career researcher’s perspective
As an early career researcher (ECR), making the transition from the “traditional” way of doing science to methods that are more open, reproducible, and replicable can be a daunting prospect. We know something needs to change about our workflow, but where do we start? Sometimes it seems easier to keep putting the transition off. After all, PhDs are hard, post-docs are hard, and we’re constantly developing all sorts of skills and just don’t seem to have enough time to do everything. One day, we will have the time and resources to tackle it, but not today. Maybe tomorrow.
At some point, we have to say “enough is enough”. Last year, after 2–3 years of knowing I wanted to radically change my practices, I started working on some of the skills I might need. It was harder than I thought, and I largely failed. Why? Because I tried to do too much too soon. Learning how to write pre-registrations and how to use R and GitHub was a bit overwhelming. The idea of sharing my (very amateur) code seemed laughable; it’s something that never ought to see the light of day and could only end in my own embarrassment. It always feels like there are so many barriers, and to pretend that it isn’t a difficult process would be misleading. But the benefits really do vastly outweigh the effort required, and ultimately the change will advance both our respective fields and ourselves as researchers.
By writing and sharing your hypotheses and analysis plans, in the form of pre-registrations or registered reports, you are confirming that you are following the hypothetico-deductive method. This helps prevent your results from being muddied by p-hacking or HARKing (hypothesising after results are known), common questionable research practices that undermine the scientific process. By performing your data pre-processing and analysis in open-source software like R, and uploading your code to a repository along with your data, you allow anyone to check that your work is reproducible: that when they run the same code with the same data, they get the same results. Additionally, by being clear about your methodology, publishing open access, and providing any necessary materials in a repository, you are helping others to check whether your work is replicable: that the conclusions hold up when others run the same study.
The Advanced Methods for Reproducible Science Workshop in January aimed to teach ECRs across many disciplines the skills needed to incorporate these practices into our normal workflow. The workshop was held at Cumberland Lodge in Windsor Great Park, and I was lucky enough to have been accepted as one of the attendees. We spent six days learning how to use GitHub and R Markdown for version-controlled and reproducible data analysis, write protocols for pre-registration, simulate data and conduct power analyses, use Bayesian statistics to make inferences about the strength of our evidence, and much more.
Importantly, for me, I learned all the techniques I had been struggling with alone, as the workshop forced me to tackle the reasons why I had yet to fully engage. Not only did I learn the skills, but the tutors and other attendees ignited a fire within me to keep pushing forward even when it feels difficult. Community is key. We discussed our hesitancy to share code that we were embarrassed by, and decided that it was better to have someone point out problems in our code, so that we could correct mistakes and learn how to improve, than to protect the code and never give ourselves the opportunity to move forward. We discussed how it can be difficult to negotiate relationships with supervisors who are resistant to supporting open research, and ways that we can build up the ECR community and support network. We also discussed how the current incentive structures are not conducive to reproducible and replicable research, as better research is usually slower research.
The key message that I took away from the workshop, however, was that you cannot hope to change all your practices overnight. Small steps are better than no steps at all, and you have to start somewhere. I would encourage all ECRs to pick one thing they’d like to change about their workflow in the interests of open and reproducible research, and try to incorporate it into their next project. Connecting with like-minded researchers is also key. We have recently established the University of Manchester Open Science Working Group (open to all researchers, both in and out of science), which operates under the umbrella of the UK Reproducibility Network. Our members are from all career stages across all faculties, and anyone interested in taking that first step, finding out more, or having a space to ask questions or allay any concerns is welcome to join us. You can sign up to our mailing list here.
So far, we have benefited tremendously from seeing the issues that occur in other disciplines and the steps taken to overcome them, such as the great work that the Software Sustainability Institute are doing. Finding a community where we can all work together on problems that exist outside of our usual research bubble is key to implementing better practices individually, especially for ECRs as we’re still building our personal networks. As Dr Kirstie Whitaker taught us at the workshop, the best way to start doing better - and more open - science is to find your tribe.
Jade Pickering is a final-year PhD student in the Department of Neuroscience and Experimental Psychology. She co-leads the newly established ReproducibiliTea journal club, which is part of the University of Manchester’s Open Science Working Group.