But is it science?

Miquel Duran-Frigola
Published in ersiliaio
4 min read · Jan 30, 2022

A few weeks ago I received an Honorary Mention at the Research Parasite Awards, held during the Pacific Symposium on Biocomputing. These awards celebrate secondary data analysis, that is, the generation of new scientific hypotheses from already existing experimental results. Research parasitism consists of reusing data published by others to ask questions that the original investigators did not consider in their work. In a 2016 editorial, the New England Journal of Medicine favored research ‘symbionts’ over research ‘parasites’ (a view I partly agree with) but, in all honesty, I’ve been more of a parasite than a symbiont myself. I have almost never had the chance or the impetus to engage with the primary producers of the data I’ve used in my career. All I’ve done is pull data points from here and there, and then do my best to display them in an enticing way. In this case, the Research Parasite Awards committee recognized the Chemical Checker as an example of sustained and systematic parasitism. Indeed, that was the declared goal of the paper: to streamline the gathering of data that others had produced over years of prolonged effort, with the hope that overlooked properties of existing and candidate drugs would emerge in doing so.

I have mixed feelings about my short trajectory in science. Sometimes I think that secondary data analysis is as important as primary data analysis, if not more so. One can argue that too often the lifetime of a dataset ends on the day of publication in a scientific journal. It perishes there, in the supplementary materials, underutilized and rarely revisited. Who wants to spend their time on something they don’t own and that is not novel anymore? Likewise, one can argue that secondary data analysis can be a means for researchers with poor or nonexistent funding (for instance, those based in Low and Middle Income Countries) to crack through the walls of glamorous journals like Science, Nature and Cell. Sometimes, however, I think otherwise. I wish I had produced the data with my own hands, as a result of a proper execution of the scientific method. Look at the world, find a hypothesis, outline an experiment, make the observations that support or falsify the assumption. Not the other way around: collect observations, undo the experiments, find a hypothesis, see if it fits with the world. Even today, when faced with an instance of research parasitism, I often ask myself: “but is it science?”

Fountain by Marcel Duchamp, 1917. Source: Artsy.

This question, I know, mirrors Cynthia Freeland’s But is it art?, a book that for some reason had a great impact on me. It probably spoke to my way of seeing science at the time, just as, long before, Borges’ parable Pierre Menard, Author of the Quixote had spoken to me. In it, the merit of Menard is to have re-written, line for line, Cervantes’ masterpiece in the 20th Century, yielding a copy of the text that is greater than the original because, when Menard writes, he uses words that have evolved their meaning and gained depth since the time of Cervantes. Similarly, in But is it art? Freeland discusses a series of art pieces that have been regarded by some as vulgar, or of lower merit, simply because, in them, the virtuosity and craftsmanship of the artist are not apparent. A landmark example of this is Marcel Duchamp’s Fountain, an artwork consisting of a urinal allegedly purchased by the artist himself, signed with the name “R. Mutt”, and submitted for exhibition at the Grand Central Palace in New York. The history of modern art is full of ready-mades, collages, appropriations and mass reproductions, but the essence was already captured in that urinal in 1917. That thing was art in its own right, because the object does not matter much after all. What matters is how the artist and the public look at the object. In other words, anybody can become an artist as long as they can offer a lens through which to look at things. No need for skillful manipulation of plastic materials, no need for a pictorial talent to imitate reality anymore.

I wonder why we, as scientists, so often lag behind the humanities in the way we conduct and appreciate our work. It seems to me that the qualities of a ‘valuable’ piece of research are akin to the qualities that were expected from a work of art only up to the 19th Century. Nobody in the arts today would disregard an object because it was created in just a few seconds, nor because it does not contribute a novel physical volume to the world, nor because it’s not perfect or presented in a closed form. Why does science demand so much investment and time? Why do most scientific questions seem to need new data to be answered? The truth is that no intellect can cope with the pile of information that is being generated every day. I am hopeful that this will change the status of secondary data analysts. People are talking a lot about ‘real world’ evidence as a means to enhance conventional clinical trials. ‘Real world’ data comes from sources that are independent of the primary research question, and using it is thus a form of research parasitism. In artificial intelligence, people talk about ‘transfer learning’ as a way to leverage big datasets produced by tech giants, pharma companies and consortia, using them as starting points for more modest, low-data tasks. This is research parasitism too. Data should be there to be used, to be reinterpreted, to be reshaped and to be placed in alien contexts. Data is not there to exist in a static form, as an appendix to support the narrative of an article published on a certain date, in a certain journal, by a certain set of authors.

Miquel Duran-Frigola
ersiliaio

Computational pharmacologist with an interest in global health. Lead Scientist and Founder at Ersilia Open Source Initiative. Occasional fiction writer.