Five reproducibility lessons from a year of reviewing compute capsules

Presented as three observations, an inference, and five recommendations.

--

Observation 1: Preparing your code and data for reproducible publication unearths gaps, misunderstandings, and bugs that are not at all obvious until you actually try to do it.

Observation 2: We have a wealth of existing resources on making your code reproducible (see, e.g., entries in PLOS’s 10 Simple Rules collection, such as Ten simple rules for making research software more robust or Ten Simple Rules for Reproducible Computational Research). My personal favorite is Karl Broman’s initial steps toward reproducible research.

Observation 3: As Code Ocean’s Developer Advocate, I’ve checked for and verified the computational reproducibility of a few hundred capsules spanning, conservatively, 15–20 separate disciplines and 13–15 programming languages — and therefore think that any comparative advantage I’ve developed is in breadth rather than depth of knowledge of reproducibility issues.

Inference: Though your first resource on reproducibility matters should probably come from someone in your research domain, I can offer an unusually broad perspective on interdisciplinary reproducibility challenges: things that come up over and over again, across domains, regions, languages, and programming languages. Five come to mind. The first four repeat advice I’ve seen elsewhere and concern getting code to actually run; the fifth concerns the relationship between inspectability and reproducibility.

1. Write a master script. A file called main.r comprising:

source('script1.r')
source('script2.r')

is a huge improvement over a README that says: “run script 1, and then run script 2.” First, this provides one canonical example of how to run your scripts,¹ which is at the least a good starting point for modification and extension. Second, it helps you (the author) see whether everything really does run from top to bottom as you expect. I pretty frequently encounter code that only works in chunks, or where one script changes a data file in ways that break another script. I am reminded of my time in youth symphony: playing from the top has a way of revealing surprises.
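
For a larger project, the same idea scales. Here is a hypothetical sketch of a fuller main.r (the file names are illustrative, not from any real capsule):

# main.r: one canonical, top-to-bottom entry point
source('01_prepare_data.r')   # read raw inputs, write data/clean.csv
source('02_run_analysis.r')   # fit models, write results/estimates.csv
source('03_make_figures.r')   # render figures into results/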

2. Specify dependency (library) versions. Breaking changes between versions of packages are not uncommon. Using Code Ocean is a fairly robust solution to this problem, in that using the package-management system will record and archive precise versions of packages (e.g., matplotlib==2.1.2). But the simpler step of recording somewhere which versions of which libraries you used (akin to specifying how much flour a cake needs rather than just writing “use flour”) is a huge step forward in helping others troubleshoot unexpected results without contacting you. In Python, run pip freeze; in R, use sessionInfo().
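
One low-tech way to keep that record in R (a sketch; where you write the file is up to you) is to snapshot the session’s package versions at the end of your master script:

# Record the attached packages and their versions alongside the results.
writeLines(capture.output(sessionInfo()), 'sessionInfo.txt')

The Python analogue, pip freeze > requirements.txt, does the same job.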

3. Presume non-interactivity during runtime. This maxim has two sub-components: first, don’t ask users to click or select anything during runtime² (just script everything); second, save results explicitly rather than assuming they’ll pop up on screen. A reader may, for instance, be executing scripts on a remote server without a display; or they may step away from their computer while everything runs, and would rather not find the program paused, waiting for input. To the extent that you can condense the process of reproducing your results to a few clicks or a few words typed, you’re reducing reproducibility friction.
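
As a minimal illustration in R (the output path is hypothetical), this saves a figure explicitly rather than waiting for a graphics window to appear:

# Write the figure to disk; no display or mouse click required.
png('results/figure1.png')
plot(pressure$temperature, pressure$pressure)
dev.off()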

4. (I “know you heard this before…”) Use relative paths. I cannot access a file at C:\Users\admin\Desktop\awesome_code. Set your code to execute from the directory your file is in (Code Ocean takes care of this automatically); to reach a subdirectory, use ./subdirectory; to go up to a parent directory, use ../. (See Jenny Bryan’s Project-oriented workflow for more on the subject.)
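
In R, that might look like this sketch (the paths are illustrative):

# Relative paths resolve against the directory the code runs from.
dat <- read.csv('./data/measurements.csv')                    # a subdirectory
write.csv(dat, '../results/checked.csv', row.names = FALSE)   # up one level, then into results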

5. Include source code rather than compiled programs. The narrowest, most mechanical sense of the phrase ‘computational reproducibility’ that we employ is ‘you press a button, wait a little bit, and then get results.’ But this doesn’t quite convey the richness of following along with someone else’s work, step by step, watching scientific results take shape. That can’t happen if your program is a .so, .exe, or .mex file made up of zeros and ones. More importantly, a reader cannot verify that the program doesn’t just spit out the right answer without executing the specified analyses. Reproducibility entails inspectability, and if compiling those files at run time makes your code take a little longer to execute, that is a fine price to pay, I think.
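
If your work genuinely needs compiled code, one pattern (a sketch, assuming a C source file helper.c ships alongside your scripts) is to build it from your master script, so readers see both the source and the build step:

# Compile the shared library from source at run time, then load it.
system('R CMD SHLIB helper.c')   # produces helper.so (helper.dll on Windows)
dyn.load('helper.so')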

These steps will help make your code reproducible on whichever platform you choose for developing and publishing it.

Questions, comments, concerns, virulent disagreement? I’m seth@codeocean.com and I’m happy to discuss further.

¹ This might seem like a small detail, but it can have an unexpectedly large impact. Case in point: the command-line tool Rscript has slightly different default behavior than launching R and then sourcing a script.

² This will not be possible or desirable for all workflows, but where possible, the interactive workflow should be secondary, and it should output and record the parameters selected.

Seth Green is the Developer Advocate for Code Ocean. He helps authors publish their code on the platform and tries to represent researchers’ points of view within Code Ocean. He spent a few years in a political science PhD program before joining Code Ocean. Find him on Twitter at @setgree.

Originally published at codeocean.com.
