Recognizing the Successes of Open Science

Sage Bionetworks
CAOS by Sage Bionetworks
Mar 24, 2019

By C. Titus Brown

Photo by Levi Frey on Unsplash

In my view, open science fundamentally depends on tools, infrastructure and practice for making the research process more open, transparent, and reproducible. Any progress on the open tooling and open practices front (almost) invariably redounds to the larger benefit of open science, and thus science more generally.

So at the recent Critical Assessment of Open Science (CAOS) meeting in New Orleans, I found myself a bit frustrated by the overall mood of doom and gloom. Sure, open science thinking has thus far failed to magically transform the scientific enterprise into a wonderland of openness and collaboration; the negatives of openness are becoming clearer as we explore them; and existing closed systems are surprisingly robust and adaptable in practice. But I think there’s lots of good news, too.

The good news is that, in the last 10 years, we have seen tremendous adoption of openness in scientific communities. For example, the widespread adoption of Jupyter and R notebook technologies means that data analysis workflows are being made explicit in a way that many can understand, share, and remix. Moreover, these open technologies are being incorporated into essentially every data science stack everywhere. Preprints in biology have taken off and there’s no going back. The majority of tools for bioinformatics are now open source. Sites like GitHub, Zenodo, Figshare, and the Open Science Framework make it trivial to share content, mint DOIs, and openly integrate digital artifacts into the literature. The rise of cloud computing means that, increasingly, workflows are portable between groups. FAIR data principles have taken off. And the Carpentries training community has spread like wildfire and teaches, as one of its underlying philosophies, more effective sharing through all of the above mechanisms.

But we don’t really stop and celebrate these wins in the open science community, because we’re relentlessly focused on the next steps. There’s plenty more to be done, and there are many disappointments and challenges, even with the successful approaches. The relentless academic focus on unsolved problems prevents us from properly celebrating the amazing achievements that we’ve already got in the bag.

So, stop and smell the roses! Sit back and appreciate our wins, over a beverage of your choice, in a comfortable community space. And start every presentation and workshop with an optimistic statement about what has already worked. I’m not sure how else to best celebrate, but please consider this a call for suggestions.

What’s next?

With full awareness of the irony, I would like to now ask: what’s next? For me, one of the main challenges moving forward is how to more effectively spread the practices above. Scientific practice tends to shift slowly, for good and bad reasons. Can we accelerate adoption of open practices that demonstrably work?

To a large extent, I think adoption of more open practices is just going to happen: data science is an increasingly large, intrinsic part of science, and notebooks make too much sense to ignore. Preprints and open source are, likewise, deeply embedded in some fields and we just need to wait for the obstacles to retire. Sharing mechanisms aren’t going away. Cloud isn’t going away. FAIR is seeing adoption by funding agencies. And the training done by the Carpentries (and friends) seems increasingly likely to become embedded in undergraduate training, because it’s how data science is done.

But there are a lot of methodologies and practices that take a bit of work. For example, at a recent SIAM CSE minisymposium, many of the talks focused on the better ways we already have of working on and with software: we have good techniques for building and supporting software via community engagement, successful business models for long-term research software support, peer code review techniques that work, robust software citation mechanisms, and good continuous integration systems, with improvements on the way.

The main remaining challenge (in my view) is that of adoption: the future is already here; it’s just not evenly distributed. And distributing skills more evenly is hard, as is adapting them to the on-the-ground needs of each scientific community. In my experience, the most effective way of doing this is by fostering the organic development of communities of practice that adopt and solidify good practice, ultimately making this practice normative within their enclosing scientific communities.

So, what are my main takeaways? I’ll stick with three:

  1. Open has been really successful in ways that, 10 years ago, we would have found hard to believe. Celebrate!
  2. The leading edge of “open” has identified lots of good and effective practice. We should figure out how to spread and solidify this practice broadly, and not just work on the next exciting unsolved problem.
  3. It’s all about communities of practice, maaaan! Invest now! And let’s talk about how to make them more inclusive and welcoming!

Comments welcome,

–titus

Thanks to the CAOS organizers for running a great meeting, to the minisymposium speakers for their great talks, and especially to Daina Bouquin for the enthusiastic discussion about making good software citation behavior normative.

Dr. C. Titus Brown is an Associate Professor at the University of California, Davis. He runs the Data Intensive Biology Lab at UC Davis, where his team tackles questions surrounding biological data analysis, data integration, and data sharing.

Originally published at sagebionetworks.org. This is part of the series: Voices From the Open Science Movement.

We develop and apply open practices to data-driven research for the advancement of human health. We are a nonprofit based in Seattle. Visit sagebionetworks.org.