Book review: “Reinventing Discovery: The New Era of Networked Science”
Reinventing Discovery by Michael Nielsen (an excellent writer across the board, IMHO) is a book about the impending revolution in the sciences towards networked and “open” science, and the promise and limitations of collective intelligence.
When I bought this book, I thought it had just been released, and didn’t realize until I cracked it open that it was from seven years ago. This is not a big problem, as the vast majority of the book is still relevant, but some examples from the past few years would have nicely supported some of Nielsen’s points.
These points include (but are not limited to):
- The internet is changing the way science is done, from individualistic and competitive to collaborative.
- Non-scientists can play a part in scientific discovery.
- New rewards (instead of just publications) are needed to incentivize scientists to work “out in the open”; i.e., to share their discoveries freely.
In the 17th century, the invention of the scientific journal was a major revolution in the way science was done: instead of keeping research results secret, scientists could now publish them in order to claim the prestige associated with their intellectual property. Other scientists could then subscribe to the journal, for a fee, to stay up to date on findings in their field. In recent times, the pay-to-read scientific journal model has shown some strain. Since much of science is publicly funded, why aren’t the results publicly available? On the flip side, the pay-to-publish, free-to-read model represents a step forward, but still requires the publishing author(s) to have (often substantial) funds at their disposal.
Point #1 is exemplified by such services as the arXiv, where pre-prints can be uploaded with zero fees. Such a service allows authors to claim the origination of their ideas, while removing the monetary barriers described above. The only expense associated with the arXiv is operating its servers, the funding for which is secured by “a global collective of institutional members”. For many, this is an improvement over paying for-profit academic publishers for the right to read or publish publicly funded research. However, papers published in prestigious journals often carry much more prestige than those available only on pre-print servers.
Point #2 is made best by the examples of the Galaxy Zoo project and the Foldit game. The former asks amateur astronomers to classify images of galaxies into specific categories (on a volunteer basis); e.g., whether depicted galaxies are elliptical or spiral. The latter frames the question of protein folding as a game, in which contestants compete to find the lowest-energy configuration of a protein, which ultimately helps scientists understand the dynamics of protein folding in reality.
With regard to point #3, some new rewards have already been developed since the book’s publication. The Journal of Open Source Software (JOSS), for example, is an academic journal that reviews open source research software and publishes short papers about it, which can be used as a form of academic credit when filling out a curriculum vitae or applying for jobs. Reinventing Discovery does an excellent job of listing other such incentives, as well as existing agreements between scientists to do their work out in the open. An exciting example is GenBank, a publicly available DNA sequence database, and the accompanying commitment of geneticists to freely share all data on the human genome.
However, there are still strong incentives for keeping data, code, and ideas in general secret. These typically boil down to producing publications and patents, in order to further one’s career or deepen one’s pockets.
I liked the author’s writing style for much of the book, but certain chapters felt like they were a simple re-hashing of earlier chapters. The book succeeds best when relaying ideas through anecdotes (about open / networked science), and fails mostly when restating the book’s or the current chapter’s thesis again and again, in slightly different wordings.
Some of my favorite anecdotes were about the Galaxy Zoo, the Polymath Project, Google Flu Trends, and Kasparov vs. the world, mostly because I’d never heard about these before, and also because they neatly illustrated where collective intelligence succeeds or fails. The chapters that revolve around an example or two and relate these back to the book’s thesis were the most interesting to read.
Another big idea in the book is the concept of an open data web: a network of data residing in / alongside the internet, the aim of which is to facilitate efficient access to data that can be shared freely. The structure of this web should increase the prevalence of interdisciplinary science, in that seemingly disparate lines of work can be linked or even merged together (e.g., concepts from control theory and reinforcement learning), simply by having scientists spend the time to develop it. This is a task which must be incentivized in order for scientists to spend their valuable time on it. However, it seems the data web would add enough value (in terms of scientific productivity) to warrant the creation of this new incentive.
I would happily recommend this book, as well as Nielsen’s writings on his blog, or his excellent textbook on quantum computing and information theory (which I haven’t finished yet). His main website is a good place to go to find out about all the projects he has in the works.