Introducing Records: Tokens for Scientific Creation

Pushing the boundaries of what scientists can do online

LabDAO
Sep 21, 2023

Want to get started with records? Create your own at try.labdao.xyz

Since the inception of the web, the scientific community has moved online to share ideas, events, results, data, and code. In many ways, scientists, just like artists, have become online content creators. While the web today supports most activities in a researcher’s daily life, it does not support sensitive activities such as managing funding and intellectual property rights without trusted intermediaries. Where artists found ways to “go direct” and build their own online audiences around their on-chain creations, scientists have not yet been able to do the same. This is finally changing.

Introducing Records

Today we are introducing Records. Records are non-fungible tokens that track (1) scientific work, (2) how it was generated, and (3) the rights attached to it, such as copyright. Every time a researcher uses a computational biology tool on the Lab Exchange, a new Record is created in the background.

An on-chain research artifact must meet the needs of the scientific process. Unlike a free-standing art NFT, a record of scientific work needs to do more than simply point to a single image or article location — it has to facilitate the scientific process:

  • Stand on the shoulders of giants: Researchers should never lose access to their work. The most robust method of ensuring this is to archive data in multiple locations with permissionless access and unique, content-based identifiers. The token creation process must be able to handle hundreds of files — even for very large data.
  • Publish houses of brick, not mansions of straw: Scientific work should be robust and easily modifiable. To reproduce work or run modifications at scale, the token needs to track not only the generated data but also any inputs used and the detailed computational method that was run.
  • Own your work: Every time a scientist writes a manuscript, creates a notebook or processes a dataset, a creative work is produced. The token needs to give researchers the flexibility to define under what terms they want to share their work and what rights future owners of their tokens might have.

We designed Records with these specifications in mind. After months of work, Records can now be created on the Optimism testnet.

What you can do with Records

The Lab Exchange is a compute and storage network maintained by LabDAO, designed for scientists to share computational biology tools and, progressively, physical laboratory services. Researchers have already begun to use tools on the Lab Exchange to design new protein binders and perform distributed small molecule docking. Running a tool on the Exchange is as easy as calling `plex_run()` through our Python library. Compute jobs are executed on a distributed, public compute and storage network, which will soon be open for anyone to join with a node. As the Lab Exchange network grows, tools are constantly added and improved.

Every time a researcher uses a computational biology tool on the Lab Exchange, a new Record is created. A Record contains three references: (1) a link to any input data, (2) a link to the self-contained tool manifest and container image that generated the results, and (3) a link to the output data.
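To make the three references concrete, here is a minimal sketch in Python. It is not LabDAO's actual schema: the class name, field names, and placeholder link values are all assumptions made up for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RecordReferences:
    """Hypothetical container for the three links a Record carries."""

    inputs: str   # link to the input data
    tool: str     # link to the tool manifest and container image
    outputs: str  # link to the output data


def to_metadata(refs: RecordReferences) -> str:
    """Serialize the references as JSON with sorted keys for stable output."""
    return json.dumps(asdict(refs), sort_keys=True)


# Placeholder values only; real Records would hold real links.
example = RecordReferences(
    inputs="placeholder-input-link",
    tool="placeholder-tool-link",
    outputs="placeholder-output-link",
)
print(to_metadata(example))
```

The dataclass is frozen so that, in this toy model, the references cannot be swapped out after the object is created.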

Records are public by default and so is the data that they reference. The owner of a record holds the exclusive and irrevocable rights to the output data and any associated copyrights.

Records make any work that is done on the Lab Exchange permanent, reproducible and available under the terms authors consider best suited for their work.

You can take a look at tutorials and run an example yourself in 3 minutes!

Records under the hood

Under the hood, Records rely on three protocol layers that power the Lab Exchange:

(1) Distributed Storage
(2) Distributed Computing
(3) Distributed Ownership Guarantees

Distributed Storage
Records reference data using Content Identifiers (CIDs), which are similar to a DOI but algorithmically generated and optimized for distributed storage. CIDs uniquely identify files within IPFS and enable their retrieval. IPFS is a file-sharing system that uses deterministic hashing based on the content of a file. Unlike a DOI, which requires permission to generate and relies on a central lookup table, a CID is permissionless and resolved through a distributed lookup table. Data shared through IPFS can be stored through token-based mechanisms like Filecoin, or through conventional (cloud) storage. The first scientific projects have already moved their data to distributed file sharing, with more data on the way.
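The idea of a content-based identifier can be sketched in a few lines of Python. This is a simplified toy, not the real CID format: actual IPFS CIDs add multihash and multibase encoding and chunk large files, and the `put` and `get` helpers and the in-memory `store` below are hypothetical stand-ins for a storage network.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Toy identifier: the hex SHA-256 digest of the content itself."""
    return hashlib.sha256(data).hexdigest()


# Toy in-memory "network" mapping identifier -> content.
store = {}


def put(data: bytes) -> str:
    """Store content under the identifier derived from its bytes."""
    cid = content_id(data)
    store[cid] = data
    return cid


def get(cid: str) -> bytes:
    """Fetch content and verify that it still matches its identifier."""
    data = store[cid]
    if content_id(data) != cid:
        raise ValueError("content does not match its identifier")
    return data


first = put(b"docking results, run 1")
second = put(b"docking results, run 1")   # identical content
other = put(b"docking results, run 2")    # different content
print(first == second, first == other)
```

Because the identifier is computed from the bytes, nobody needs permission to generate one, identical files always map to the same identifier, and anyone who retrieves the data can check it against the identifier they asked for.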

Distributed Computing
Bacalhau is an open-source protocol that brings computation to IPFS. It allows users to run arbitrary container and WebAssembly (WASM) images as tasks. Even better, the protocol is based on the Compute over Data (CoD) architecture, in which compute jobs run close to the data they process. Because anyone can connect a compute node to the network, tools deployed to the network in one location can be run from any other node. As large data and low reproducibility are the norm in research projects, distributed computing holds promise for more robust and efficient scientific computing.

Distributed Ownership
Every Record minted results in a permanent on-chain artifact that represents the creative work. The Record references the experiment’s inputs, the tools used, and the output of the experiment. Records give owners rights to the generated data and determine its use. These rights are transferable (if the token is transferred, the right to the data is too) and irrevocable (the license can’t be changed in the future).
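The two properties, transferable and irrevocable, can be illustrated with a small Python toy model. This is not the on-chain contract behind Records; the `RecordToken` class, its methods, and the example names are assumptions invented for this sketch.

```python
class RecordToken:
    """Toy model: ownership can move, license terms cannot change."""

    def __init__(self, owner: str, license_terms: str):
        self._owner = owner
        self._license_terms = license_terms

    @property
    def owner(self) -> str:
        return self._owner

    @property
    def license_terms(self) -> str:
        # Read-only property: there is no setter, so assigning to it fails.
        return self._license_terms

    def transfer(self, sender: str, recipient: str) -> None:
        """Move the token, and with it the rights, to a new owner."""
        if sender != self._owner:
            raise PermissionError("only the current owner can transfer")
        self._owner = recipient


token = RecordToken(owner="alice", license_terms="example-license")
token.transfer(sender="alice", recipient="bob")
print(token.owner, token.license_terms)
```

In this model, attempting `token.license_terms = "something else"` raises an `AttributeError`, which mirrors the idea that the license attached to a Record is fixed once it is minted.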

Data ownership matters to researchers for scientific credit and intellectual property claims. As a result, we believe that Records can be used to create more transparent mechanisms of academic credit among scientists and serve as future Lego blocks for the interface between scientists and other ecosystems, such as decentralized finance and real-world assets communities.

Perspectives

Over the last few months, a shift in the way science can be done online has been gaining momentum. Online research funding communities, such as VitaDAO, have raised capital through governance token issuance and have started sponsoring scientific projects, now worth over 4M USD. More of these funding communities are currently in development.

Legal wrapper tokens for intellectual property rights, such as IP-NFTs, have moved on-chain in lockstep with the rise of funding communities. In return for financing a researcher within their home organization, communities receive a tokenised right to any future intellectual property generated within the project.

Creative Artifacts, such as our Record tokens, are being designed to close the loop between supporters, funders and life science researchers in two directions:

(1) Funding & Legal rights
The creative scientific process does not begin with fundraising; it begins with lines of text and code to test an idea. As computational biology capabilities keep improving, we believe the amount of derisking that can be done without spending funds on laboratory experiments will continue to grow. This will change the fundraising process: competent decision-makers will demand to see some initial computational work before funding a laboratory experiment.

(2) Community & Governance rights
In order to prevail, a research funding community has to do more than write checks. It needs to build its own culture and create added value for researchers to keep attracting the best talent. We believe such added value can include curated benchmark datasets, foundation models and other content — all can be tracked using Records.

Want to get started with records? Create your own at try.labdao.xyz
