Why we need to seed blockchain in research: crisis and opportunities for collaboration

Published in validitylabs, Aug 23, 2018 (15 min read)

Decentralized, trustless and properly incentivized solutions are the future of scientific collaboration.

It is likely that you are not a researcher, so here’s a little thought experiment.

You are joining a project that fascinates you, although it isn’t 100% clear what exactly you will be researching, or how. You start off by reading the first paper on the subject and go depth-first through a number of its references until you gain a better understanding of what you are going to build upon. This takes weeks. It is clear you won’t be repeating the entire body of research done on this subject matter; it doesn’t even cross your mind to question the validity of that project. And even if you did, your contract’s duration is strictly limited, and you took the job to do something that other human beings haven’t done before, not what they already have.

You’ve gone to the first conferences to sell your work and make your mark as the person behind X, although it doesn’t really work yet. Perhaps some of the things you’ve found are not exactly as expected; you keep trying until there’s a positive outcome that is worth publishing. The clock is ticking and the paper counter for this year still says 0; you’ve learnt that this is certainly not a good sign. But you are unruffled because you were assured there will be enough material to make at least 2 papers out of it.

After a few months of back and forth between you and the collaborators, you finally send the first manuscript to the journal. Two months later, it has been reviewed by a volunteer researcher, and although it is not a rejection, in your darkest hour you wish you could send this letter back. You take comfort from discussions with your peers, who unanimously agree that the reviews have hardly ever been helpful.

Once you get your first request to review someone else’s paper, you’re flattered and committed to doing a good job. But as you do it for free and have tons of work, you end up dealing with it just before the deadline anyway. You can’t find errors without access to the analysis pipeline and data, so you focus on potential logical fallacies, or on pointing out what the authors could investigate in addition to what they have already done.

By now it is clear to you how much of your working time is consumed by scientific communication, and how many times you probably reinvented the wheel because you were short on resources or received insufficient mentorship/oversight. You might even be short on funding and need to apply for a research grant. You’ve been told that applying for funding is an art; you need to write innovative-looking but safe grants, disguising a part of what you have already done as what you will do.

As you go on in your career, you hear a number of stories that seem surreal. It cannot be true that somebody was bullied for not working weekends or for taking holidays; was manipulated into leaving research for proving the irreproducibility of heavily funded research; or had the credit for their lab results taken by another researcher. The alleged handling of those cases often comes as a shock to you.

I’m sure some of this sounds familiar to you, even if you are not a researcher yourself. There is a severe inherent problem in any competitive space, described by [-1] in research as “(…) [obsession] with the prestige points awarded by journals as the means to win jobs, promotion or funding”. I doubt that a ‘winner-takes-it-all’ environment stimulates an average salaried Jane or John to strive for truly solid answers to tough questions, or to make a leap of faith and push for crazy inventions. And some say the crisis has peaked precisely in an era in which mistakes made by researchers are used as political ammunition [8] to spread misinformation: that there’s no global warming, that vaccinations are deadly, or that the Earth is flat.

Crisis (as in ‘call for action’)

But for a complicated network of researchers, publishers, executives or granting institutions, nobody would benefit from what comes out of the brain of X. Research is highly collaborative, and exchange as well as trust are essential to it. As Dr. Joris van Rossum says in this report [1] about the aspect of communication,

“(…) research depends on an effective exchange of ideas, hypotheses, data, and results(…). Scholarly communication is perceived to be suffering from legacy workflows, outdated publishing paradigms, and business interests that are diametric to the interest of science.”

Let’s spend a few minutes on a number of problems, which call for counter-measures.

                       REPRODUCIBILITY CRISIS

I myself was once unable to reproduce someone else’s results of a numerical experiment, which I intended to build upon. Besides personal frustration, I see it as breaking trust in research and causing an enormous waste of time/resources, particularly of those who don’t realize the error and keep going. I do, however, agree with the statement that

“being at the cutting edge of science means that sometimes results will not be robust. We want to be discovering new things but not generating too many false leads.” [2]

What amplifies the reproducibility crisis beyond the inevitable though is selective outcome reporting and pressure to publish [2], which are the products of competitiveness and wrong incentives. It is then followed by poor analysis, insufficient oversight, or methods/code/raw data being unavailable; aspects that, when made transparent, could create a working self-correcting ecosystem.

Greater tolerance and more efficient infrastructure for self-reporting mistakes [11] would certainly reduce the severity of the problem as well.

             SECRECY, COMPETITION AND CLOSED SCIENCE

Anonymous Academics speak up that performance-driven culture is ruining scientific research [12], leading to “showboat science that under-investigates less eye-catching — but ultimately more useful — areas.” The fierce competition and pressure to publish has already caused mental health damage in academia, with 33% of PhD students being at risk of a common psychiatric disorder [13], attributed in part to not transparent job demands and control, or juggling work-family demands. From the point of view of senior researchers,

“everywhere, supervisors ask PhD students to publish in high-impact journals and acquire external funding before they are ready.” [14]

The pressure to publish in high-impact journals results from the widespread adoption of a wrong method of assessing research impact. DORA [9] (San Francisco Declaration on Research Assessment) made recommendations to “eliminate the use of journal-based metrics, such as Journal Impact Factors, in funding, appointment, and promotion considerations”.

It cautions that

“the Journal Impact Factor, as calculated by Thomson Reuters*, was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article.” [9]

Fierce competition also fuels secrecy: not sharing ideas or data openly for fear of being scooped, and not releasing too many details about methods or pipelines post-publication to avoid being called out for one’s own mistakes. The Leiden Manifesto for Research Metrics [14] rightly argues that “transparency enables scrutiny” and, among others, calls for

“keeping data collection and analytical processes open, transparent and simple”

and

“allowing those evaluating to verify data and analysis”.

Enabling scrutiny will certainly act as a prevention mechanism for “gaming the system”.

                   REDUNDANCY AND INEFFICIENCY

Other problems calling for improvement are redundancies and inefficiencies in the research pipeline. That includes bias in communication and publishing, as well as issues in peer review and in post-publication peer review.

The bias, coupled with the push for accumulating high-impact publications and citations, means that there is a preference to communicate only “positive results”, i.e. those that prove the hypothesis rather than disprove it, as those are more likely to be noticed. Negative results are currently viewed as unworthy of publication, or even of a mention in a publication, to the extent that special journals are needed to accommodate them [5]. The good news is that, if we eliminate that bias and also share research ideas and pipelines openly, we will prevent thousands of scientists from reinventing the wheel and repeating redundant experiments. This translates into reduced costs and perhaps a more purposeful allocation of research funding.

Peer review on the other hand, although designed to draw the line between poor and high quality research, doesn’t seem to work as intended [7]. It is rather perceived as a hurdle, as

“the advance of scientific knowledge and progress is significantly slowed down by the process of peer review.”

In its current form, it allows for rejections when there is a conflict of interests or of points of view between the researchers. Besides, researchers have little incentive to handle a review either diligently or promptly.

Similarly, the post-publication peer review system is negligent and inefficient. In the article on “a tragedy of errors”, the researchers described their strenuous effort to report substantial errors [11].

“We learned that post-publication peer review is not consistent, smooth or rapid. Many journal editors and staff members seemed unprepared or ill-equipped to investigate, take action or even respond. Too often, the process spiralled through layers of ineffective e-mails among authors, editors and unidentified journal representatives, often without any public statement added to the original article.”

They point out six problems, which perhaps could serve as an inspiration for a new venture.

1. Editors are often unable or reluctant to take speedy and appropriate action.
2. Where to send expressions of concern is unclear.
3. Journals that acknowledged invalidating errors were reluctant to issue retractions.
4. Journals charge authors to correct others’ mistakes.
5. No standard mechanism exists to request raw data.
6. Informal expressions of concern are overlooked.

Finally, I’d argue that there is another aspect to missed opportunities in research communication, which is often overlooked. As we investigate our narrow domains in detached closed boxes, we perceive interactions with peers outside our field as not that meaningful. We currently fail dramatically to leverage technology or available resources to enable interdisciplinary dialogue between different research “species” on anything other than a political level. But opening such a dialogue would open our eyes, particularly to new ideas or to applications of existing methods from another field in a new context. It would even spare some of us unwanted publicity and embarrassment [6].

                     OPEN ACCESS OR PAYWALLS

“He looked at his list of abstracts and did the math. Purchasing the papers was going to cost $1000 this week alone — about as much as his monthly living expenses.” [3]

Opening up science nowadays often amounts to pirating papers through platforms like SciHub or even Twitter (see #icanhazpdf) to circumvent expensive paywalls. What was originally a genuine need for means to propagate scholarly information has somehow evolved into some of the most lucrative, monopolistic businesses in the world [4].

We can have open science in the interest of science, or monopolistic paywalls in the interest of someone’s business; I don’t think we can have both. There is a strong need for a better, more democratic system, perhaps even one that rewards the actual researchers for their own work. As I see it, besides giants blocking free access to articles, there is some kind of economic obliviousness in researchers that still puzzles me. Namely, researchers not only get no cut for every article sold, but also have to pay, not get paid, for publishing their work. If J.K. Rowling paid Bloomsbury a few thousand bucks per 20 pages of Harry Potter she wrote, how would that work?

                           BROKEN TRUST

I argued that at the foundation of research is evidence-based scholarly communication, as well as trust. But what if that trust is broken?

From Anonymous Academics, we learn that

“there are many who are so attracted by the prospect of success that they are willing to obfuscate, mystify and perhaps falsify research to game the system and reap the plentiful rewards” [12]

“sadly, students are also vulnerable to the theft of data, ideas and materials; not only by their colleagues, but sometimes by their own supervisor” [15].

Even if this has always been the case, now, owing to the widespread Internet and the democratization of historically strongly hierarchical structures, we finally have a chance to act not just for those affected, but for science as a whole. Those bad practices inflict irreversible damage on society’s perception of the validity of what researchers do. In turn, this affects how much taxpayers’ money goes to science, or even worse, whether the planet will burn up or not. Paradoxically, we need to consider the introduction of trustless solutions in order to earn back the trust.

Opportunities: where blockchain can help (and where not)

http://ec.europa.eu/programmes/horizon2020/en/h2020-section/open-science-open-access

The quality of research, and hence the future of mankind, depends on effective and efficient exchange of ideas, data sets, pipelines, results. Whether you are involved in academic or industry research, we share a common denominator, namely a clear profit from:

making research results more reproducible and sound;
evaluating research impact based on a fair metric, and hence directing funding where its yield is the highest;
preventing redundancies and a waste of human or financial resources;
making use of new technologies to facilitate exchange of expertise not just across geographic locations but also disciplines;
repairing and preventing the damage caused by the cases where the trust is broken.

I will not cover the primer on blockchain technology (see e.g. [16]) but rather focus on the key issues that it helps to alleviate or completely solve. For example, take decentralization, meaning it is designed to be democratic and impossible to control by one party; transparency, i.e. everyone can view what has been pushed to the network; or censorship-resistance, meaning that what has been recorded remains immutable. Additionally, as the exchange of a digital good such as a token or coin is an inherent part of it, it introduces an economic incentive for being a part of the system built on it.
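To make the immutability claim a bit more tangible, here is a minimal sketch of a hash-linked list of records. It is a toy under stated assumptions and not a blockchain: there is no network, no consensus and no decentralization, and the function names (`record_hash`, `append`, `is_consistent`) and the record fields (`who`, `what`) are hypothetical, chosen only for illustration. It shows just one ingredient: each entry stores the hash of the previous one, so altering an earlier record becomes detectable.

```python
import hashlib
import json


def record_hash(record: dict, prev_hash: str) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger: list, record: dict) -> None:
    """Append a record, linking it to the previous entry's hash."""
    # The first entry links to an all-zero placeholder hash.
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    ledger.append({
        "record": record,
        "prev": prev_hash,
        "hash": record_hash(record, prev_hash),
    })


def is_consistent(ledger: list) -> bool:
    """Recompute every hash; an edit to any stored record breaks the chain."""
    prev_hash = "0" * 64
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != record_hash(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical research log: who did what, in order.
ledger = []
append(ledger, {"who": "researcher_a", "what": "collected raw data"})
append(ledger, {"who": "researcher_b", "what": "ran analysis"})
print(is_consistent(ledger))  # True

# Quietly rewriting the attribution of the first step is detected.
ledger[0]["record"]["who"] = "someone_else"
print(is_consistent(ledger))  # False
```

In a real network, many independent parties hold copies of such a chain and agree on its contents, which is what makes rewriting history impractical rather than merely detectable.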

These protocols/technologies cannot solve all major problems on their own, but they may serve as an infrastructure for combining on-chain and off-chain components through the right economic incentives. Alternatively, as described in [1], all research pipelines could become “a large, dynamic body of information and data that is collaboratively created, altered, used and shared” with no single owner.

Here are just a few ideas of what this could mean in practice for the problems described above.

# To be treated as a worksheet
- reproducibility - incentivize sharing data, code, methods for a review together with the results; incentivize open post-publication peer review process;
- secrecy and competition - introduce a system tracing and documenting the whole research pipeline for excellence evaluation and grant applications; fund research through "crowdfunding" ICOs (Initial Coin Offerings);
- redundancy and inefficiency - incentivize quick and diligent peer review; de-anonymize it, grow your reputation as a reviewer to claim more responsibility for your actions;
- broken trust - trace and properly attribute who collected the data, did the experiment or wrote the manuscript, and when; publicly record working hours to fight exploitation;
- paywalls - replace publishing giants with a platform using decentralized storage, and applying transparent and economically incentivized peer review;

Other ideas for the utility of DLTs (distributed ledger technologies) in research include replacing the function of patent offices [0], creating a “marketplace for research, where labs or groups specialise in specific aspects of the research workflow”, or avoiding peer pressure by posting out-of-the-box ideas or hypotheses anonymously [1, 17]. There is also hope for the adoption of different research metrics, such as the anonymously proposed decentralized Academic Endorsement System, which could use AEP tokens (academic endorsement points) to endorse scientific work, adding more value to research outputs beyond manuscripts, such as data, blog posts, code or pipelines [18].

Blockchain-backed projects fostering more innovation in research are already beginning to surface. To name a few:

Matryx (MTX) allows users to post a bounty on a research problem; users work on it and improve the solution collectively, and when the solution is accepted, the rewards are shared among all contributors.
Covee Network is a platform that helps organize decentralized teamwork without intermediaries and gives users a stake in the project’s success.
Scienceroot wants to create a whole ecosystem consisting of the collaboration, funding and publishing platforms.
Blockchain for Peer Review will develop a protocol backed by a consortium of organizations to make peer review more transparent and trustworthy.
MaterialsZone Team plans to tokenize scientific data and incentivize researchers to openly share their findings.
Pluto wants to give back the governance of scholarly communication to researchers, making publishing and reviewing processes open and transparent.
DEIP will be a decentralized publishing, research funding and reviewing platform.

In addition, what we can all democratically do is take away some of the attention from the prestige, medals and making history in science, and rivet it more to the “we” aspect of research.

Collaboration (where programs like SEED come in)

As we speak of scientific collaboration, we take the opportunity to leverage blockchain and non-blockchain technologies to support this goal. We are currently gathering the necessary resources to enable interdisciplinary dialogue during an unprecedented 5-day and then 3-month program. SEED (a conference and training, think tank, then incubator) is going to be an open source community effort to put together different pieces of a puzzle and sketch new blockchain-backed tools with the goal of improving the quality and pace of research and innovation.

Here are example projects at the intersection of research and industry, which demonstrate the adoption scale of blockchain-backed solutions.

Energy: peer-to-peer energy trading, including empowering communities to control their own energy supply, or to invest directly in renewables [19, 20, 25];
Mobility: autonomous vehicles, ride-sharing systems [21, 22];
Pharma & Healthcare: secure sharing of medical records for diagnosis and research; drug supply chain and clinical trial records [23];
Big Data/IoT/Automation: tracking products from producer to consumer; enabling a safe connected home; or distributed spacecraft mission control [25,26,27];
Education & Development: granting refugee identity; academic certifications; preventing corruption; tracking humanitarian aid or research grants [25, 28, 29, 30].

Needless to say, none of those would have ever existed but for the power of interdisciplinary collaboration.

At SEED, there will be room for talks, working on projects, exchanging ideas and solutions, and finally, implementing the soundest of them during the incubator. Let me extend that invitation to you: you could be a part of it.

The bright side

People are waking up and that’s a good thing. It’s our responsibility to bring back respect for what researchers do through a strategic, simultaneous adoption of democratic, decentralized solutions with proper rewards and credit for our effort, and working incentives for open exchange and “quality control”. I hope you can now see why some are so hopeful about what blockchain infrastructure can provide to ensure this, even though these are only its early days.

If you are at least a bit curious and motivated to be a part of this evolution, then I hope to see you in Davos [-2].

And if in addition you have some money to spare for this project, email us at seed@validitylabs.org.