FAIRware: towards FAIR community metadata practices

Raphael Sonabend
Published in Wellcome Data
Apr 19, 2022

Written by Dr Michelle Barker, Open Science Consultant, and Mark Musen, Professor of Biomedical Informatics and Director of the Stanford Center for Biomedical Informatics Research, Stanford University.

FAIRware aims to enable researchers to make their research practices more Findable, Accessible, Interoperable and Reusable (FAIR). The project is being carried out by the Stanford Center for Biomedical Informatics Research as one of the five flagship projects of the Research on Research Institute (RoRI)*.

The FAIRware project is developing the FAIRware Workbench, an open-source software tool that researchers can use to assess their own metadata. It will be made publicly available later this year. In the meantime, the team is eagerly seeking collaborators who want to test the prototype.

Focusing on FAIR Metadata

FAIRware aims to advance efforts in the research community to increase the FAIRness of research objects, particularly research metadata. Metadata is used to describe data, including aspects like geographic locations and research instruments. Properly describing and documenting data makes it easier for others to find and understand research data.

FAIRware recognises that the only determinant of data FAIRness that investigators have direct control over is the metadata that annotate their data and other research objects. Data become FAIR only when they are described with metadata that adhere to community standards. These standards must enumerate the precise attributes of experiments that need to be described in order to make sense of what was done, and adopt controlled, searchable terms that can be compared directly and reliably with those in other metadata records. Unless such standards are applied throughout the metadata workflow, the corresponding datasets will not be FAIR.
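To make the idea concrete, here is a minimal sketch in Python of what checking a metadata record against a community standard could look like. Everything in it is hypothetical: the `STANDARD` dictionary, its attribute names and its allowed terms are illustrative assumptions, not an actual reporting guideline and not the format used by FAIRware or CEDAR.

```python
# Hypothetical community standard (an assumption for illustration only).
# Each required attribute maps either to a set of controlled terms or to
# None, meaning any non-empty value is accepted.
STANDARD = {
    "organism": {"Drosophila melanogaster", "Mus musculus"},
    "instrument": {"confocal microscope", "electron microscope"},
    "collection_location": None,
}


def check_record(record, standard=STANDARD):
    """Return a list of human-readable problems found in a metadata record."""
    problems = []
    for field, allowed in standard.items():
        value = record.get(field)
        if value is None or str(value).strip() == "":
            # The standard enumerates this attribute, but the record omits it.
            problems.append(f"missing required attribute: {field}")
        elif allowed is not None and value not in allowed:
            # The value is present but is not one of the controlled terms.
            problems.append(f"uncontrolled term for {field}: {value!r}")
    return problems


# A record that uses a free-text organism name and omits the location.
record = {"organism": "fruit fly", "instrument": "confocal microscope"}
for problem in check_record(record):
    print(problem)
```

In this sketch the record is flagged twice: "fruit fly" is not a controlled term for the organism, and the collection location is missing. A record that fills in every attribute with controlled terms would produce an empty list.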

Key benefits of FAIR metadata include increased integrity of research results and the re-use of research outcomes to make new discoveries and to improve responses to societal challenges. A European Commission study found that not having FAIR research data costs the European economy at least €10.2 billion every year.

Elements of FAIRware

The FAIRware project consists of three parts, as shown in the figure below:

  1. Offering Metadata for Machines (M4M) workshops, which are facilitated by GO FAIR, to bring together a community to agree on metadata standards for the types of experiments that they perform.
  2. Providing metadata for the Center for Expanded Data Annotation and Retrieval (CEDAR) Workbench to allow scientists to annotate their experimental datasets with metadata that adhere to appropriate standards.
  3. Developing the FAIRware Workbench, which allows researchers to assess the quality of their metadata and offers concrete steps for improving those metadata and their adherence to the community standards.
A diagram showing the three stages of the FAIRware project: Metadata for Machines workshops, CEDAR Workbench and FAIRware Workbench.
FAIRware. CC-BY Marcos Martinez-Romero.

Virtual Fly Brain Case Study: changing community practices

FAIRware aims to enable scientific communities to improve their own practices by creating community-specific FAIR metadata standards, and providing a tool for automated evaluation of the use of those standards. FAIRware is engaging with a range of research communities to collaborate on case studies, including Virtual Fly Brain. Virtual Fly Brain (VFB) is a consortium of neurobiologists who are exploring the detailed neuroanatomy, neuron connectivity, and gene expression of Drosophila melanogaster (often known as the fruit fly). Researchers in the community are using the FAIRware tools to build FAIR practices into their routine scientific workflows, to facilitate improved design, management, and support of FAIR research outputs.

The VFB community is one of many who are using FAIRware to quickly develop the reporting guidelines and controlled terminologies needed to specify standardized metadata, so that they can readily incorporate such standards into their work to ensure the research data the community generates is FAIR from the start.

FAIRware’s innovative approach addresses the varying needs of different research fields. The software supports evaluation of whether the metadata used to describe datasets are “rich” and adhere to community standards — key elements of making data FAIR. As explained by Mark Musen from the Stanford Center for Biomedical Informatics Research, “The tools developed by the FAIRware project use an explicit, editable representation of the reporting guidelines needed to structure metadata for different types of scientific experiments, along with ontologies that can provide standardized values for metadata elements, to simplify the authoring of metadata and to ensure adherence to community standards.”
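As a rough illustration of how an ontology can supply standardized values for a metadata element, the following Python sketch maps free-text entries to a preferred label. The `ONTOLOGY` list, its `EX:` identifiers and the `standardize` helper are all made up for this example; they are assumptions and do not reflect how the FAIRware tools or any real ontology service are implemented.

```python
# Hypothetical mini-ontology (made-up identifiers, for illustration only).
# Each term has a preferred label plus synonyms an author might type.
ONTOLOGY = [
    {
        "id": "EX:0001",
        "label": "Drosophila melanogaster",
        "synonyms": ["fruit fly", "D. melanogaster"],
    },
    {
        "id": "EX:0002",
        "label": "confocal microscope",
        "synonyms": ["confocal", "laser scanning confocal microscope"],
    },
]


def build_index(ontology):
    """Map every lower-cased label and synonym to its ontology term."""
    index = {}
    for term in ontology:
        for name in [term["label"], *term["synonyms"]]:
            index[name.lower()] = term
    return index


def standardize(value, index):
    """Return (id, preferred label) for a free-text value, or None if unknown."""
    term = index.get(value.strip().lower())
    return (term["id"], term["label"]) if term else None


index = build_index(ONTOLOGY)
print(standardize("Fruit Fly", index))   # matched through a synonym
print(standardize("zebrafish", index))   # not in this toy ontology
```

Because every record ends up with the same identifier and label for the same concept, values can be compared directly across metadata records, which is the point of using controlled terms.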

Next steps

FAIRware has significant potential for improving understanding of how to design, manage, and support FAIR research outputs. The prototype of the FAIRware Workbench will be publicly available in August 2022, and the team is eagerly seeking collaborators who may be interested in using the technology to render their experimental data more FAIR.

For more information, email Mark Musen.

*FAIRware is led by a consortium of five RoRI partners: the Wellcome Trust, the Austrian Science Fund, the Canadian Institutes of Health Research, the National Institute for Health Research (UK), and the Swiss National Science Foundation.
