Nature Journals’ pilot with Code Ocean: a Developer Advocate’s perspective

Suppose that you are a graduate student in computational biology putting the final touches on a methods-focused paper. You would like to make your software pipeline available to your manuscript’s referees, but you’re unsure how much time or interest they will have for manually editing or debugging code.

Imagine further that your software reflects the contributions of three people. Your PI writes in MATLAB; your co-author uses R’s Bioconductor packages; and you developed your Python code as a Jupyter notebook, using TensorFlow and executing on a GPU. Somehow, you need to ship all of your materials to an unknown party, and you expect that they will evaluate your code on both its documentation and its ease of use.

Where to begin? Do you write a shell script that runs each of the software pieces in turn? (It will be difficult to guarantee that this works across different operating systems and software versions.) Do you provide your work as is, along with a detailed readme of how to run each part? (Reviewers’ receptiveness to such steps is unknown.) Do you package everything up in a virtual machine and provide instructions for reviewers to set it up? (Some reviewers may lack the right hardware and software licenses to make this a reality.)

These challenges, multi-dimensional and ubiquitous, are why Code Ocean is excited to be working with Nature journals on a peer review pilot, providing editors and reviewers access to fully configured and executable versions of authors’ code. We can’t eliminate all technical hurdles that authors and reviewers face, but:

  • A Code Ocean compute capsule allows an author to configure their computational environment precisely, and guarantees reviewers access to that exact same environment — without needing to download or install anything;
  • Code Ocean is designed to run multiple programming languages in one environment with the click of a single button;
  • Publishers and researchers can link a published capsule, which carries its own DOI, to the corresponding article as a “version of record,” thereby enabling readers to reproduce published findings with a single click;
  • Perhaps most importantly, Code Ocean guides users towards creating one canonical set of ‘Published Results’, and making transparent from start to finish how those results were generated.

More generally, as Code Ocean’s Developer Advocate, I have watched from the sidelines as academic pipelines have exploded in complexity. I have not, however, observed a commensurate rise in anti-entropy practices from software engineering — in part because developer-born tools and practices are not clearly a natural fit for academic work. This is where Code Ocean comes in. We aim to adapt tools such as Docker to be intuitive for researchers at all levels of technical experience; moreover, we seek to guide users towards best practices in reproducibility, such as executing Jupyter notebooks from top to bottom by default, at all stages in the process of developing and publishing code.
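
To make the "top to bottom" idea concrete, here is a minimal sketch in Python of one way a researcher might check that a saved notebook was run in order. It is not Code Ocean's implementation. It assumes the standard nbformat 4 JSON layout, in which each code cell stores an `execution_count`, and the names `executed_top_to_bottom` and `make_notebook` are hypothetical helpers invented for this illustration.

```python
import json


def executed_top_to_bottom(notebook_json: str) -> bool:
    """Return True if every non-empty code cell ran once, in document order.

    Assumes the nbformat 4 layout: a top-level "cells" list whose code
    cells carry an integer "execution_count" (or null if never run).
    """
    notebook = json.loads(notebook_json)
    counts = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are never executed
        source = cell.get("source", "")
        text = "".join(source) if isinstance(source, list) else source
        if not text.strip():
            continue  # empty code cells do not receive an execution count
        counts.append(cell.get("execution_count"))
    # A clean top-to-bottom run in a fresh kernel numbers the cells 1, 2, 3, ...
    return counts == list(range(1, len(counts) + 1))


def make_notebook(counts):
    """Build a tiny hypothetical notebook whose code cells have the given counts."""
    cells = [
        {
            "cell_type": "code",
            "execution_count": count,
            "metadata": {},
            "outputs": [],
            "source": ["x = 1\n"],
        }
        for count in counts
    ]
    return json.dumps(
        {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
    )


if __name__ == "__main__":
    print(executed_top_to_bottom(make_notebook([1, 2, 3])))  # True
    print(executed_top_to_bottom(make_notebook([1, 3, 2])))  # False
```

A check like this only inspects the saved counters; it does not re-run anything. Re-executing the notebook in a fresh kernel remains the stronger guarantee.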

Should you have any questions about the partnership, our future development plans, or anything else in the reproducibility ecosystem, please get in touch; we would be happy to hear from you.

Seth Green is the Developer Advocate for Code Ocean. He helps authors publish their code on the platform and tries to represent researchers’ points of view within Code Ocean. He spent a few years in a political science PhD program before joining Code Ocean. Find him on Twitter at @setgree.

Originally published at