One postdoc’s path to reproducibility

Code Ocean
Nov 28, 2017 · 4 min read

“Everything could have been done much faster.” This was my main reflection, just after finishing my Ph.D. Like many scientists, I relied on previously published works and tried to build upon them. If you have ever tried to reuse somebody else’s research, chances are it was a challenge.

I discovered that this “reuse” problem was a widespread global concern, which many call the “Reproducibility Crisis” in science. Realizing the magnitude of the problem was the first step toward proposing a solution, which would later be named Code Ocean.

As part of my Ph.D., my scientific efforts were dedicated to exploiting airborne and spaceborne multispectral and hyperspectral images for environmental monitoring. This work was a joint effort with leading researchers from the DLR (German Aerospace Center) and various universities and geological societies around Europe, South Africa and Kyrgyzstan.

As we got deeper into the project, we faced multifaceted complexities. Even when the code was available on GitHub or another repository, we came across a variety of common roadblocks one can face when trying to reuse someone else’s code, including:

  • Obtaining the right operating system, programming language, dependencies and their correct versions, before debugging the code.
  • Acquiring hardware to run that code, such as GPUs or a significant amount of memory.
  • Contacting the researcher or a co-author about missing files or dependencies, which can often lead from one person to the next.
  • Chasing down broken links by searching the web or contacting the original authors.
  • Troubleshooting errors you don’t know how to solve by searching for a solution on Stack Exchange or asking favors of colleagues.
  • Plus, any of those tasks can lead down a rabbit hole or to a dead end.
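
To make the first roadblock on this list concrete, here is a minimal sketch, assuming the code in question is written in Python, of how an author might record the operating system, interpreter version and installed package versions alongside their code. The function name `snapshot_environment` is hypothetical and chosen only for illustration; this is not how Code Ocean itself captures environments, just one simple way to leave the next reader fewer things to guess.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Collect the OS, Python version, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        # Sort by package name so two snapshots are easy to compare.
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be saved next to the code.
    json.dump(snapshot_environment(), sys.stdout, indent=2)
```

Saving this output next to the published code tells a reader which versions the results were produced with. It does not solve the hardware, missing-file or broken-link problems above.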

For every article, there could be thousands of readers who try to reverse engineer the experiment. This is an extreme waste of time for some of the brightest minds in the world. So why not make all the material available and executable, so that users can independently reproduce the findings? The peer review process in scientific publishing is meant to enable that upon publication anyway; without executable material, it just takes much more time, effort and funding.

Making the code and data available is an important step, but with today’s technology we can also eliminate all the “IT” setup and installation. This will allow researchers to invest their energy in building upon existing findings and moving them forward.

While there is a lot of material about reproducibility and reuse, I couldn’t find a tangible solution that I could apply easily and effectively. As part of the 2014 cohort of the Runway Startup Postdoc Program at the Jacobs Technion-Cornell Institute at Cornell Tech in New York City, I worked quietly with a founding team for two years to develop a solution to the problem. Code Ocean was born.

For the first time, authors of scientific articles can upload their code and data in any open source language, as well as MATLAB and Stata, and link a working computational environment together with the code and the associated article.

Researchers and engineers can change parameters, modify the code, upload their own data, run it again, and see how the results change — without installing anything on a personal computer. Everything runs in the cloud and is easily citable for academic credit with a DOI.

The mission we set when founding Code Ocean was to make the world’s scientific code more open and reproducible. I hope my fellow researchers will find that the work we do at Code Ocean helps them streamline their research activities, share their work with peers, link it to their publications and reuse it in new and exciting research.

Simon Adar is the CEO of Code Ocean, a computational reproducibility platform. He holds a Ph.D. in hyperspectral image processing from Tel-Aviv University. Find him on Twitter @SimonAdar.

Originally published at codeocean.com.
