Research Ideas and Outcomes:
Conference Abstract
Corresponding author: Stian Soiland-Reyes (soiland-reyes@manchester.ac.uk)
Received: 24 Aug 2022 | Published: 12 Oct 2022
© 2022 Stian Soiland-Reyes, Peter Sefton, Leyla Jael Castro, Frederik Coppens, Daniel Garijo, Simone Leo, Marc Portier, Paul Groth
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Soiland-Reyes S, Sefton P, Castro LJ, Coppens F, Garijo D, Leo S, Portier M, Groth P (2022) Creating lightweight FAIR Digital Objects with RO-Crate. Research Ideas and Outcomes 8: e93937. https://doi.org/10.3897/rio.8.e93937
RO-Crate (
The FAIR Digital Object (FDO) approach (
Here we present how we have followed the FDO recommendations and turned research outcomes into FDOs by publishing RO-Crates on the Web using HTTP, following best practices for Linked Data. We highlight challenges and advantages of the FDO approach, and reflect on what is required for an FDO profile to achieve FAIR RO-Crates.
The implementation supports a broad range of use cases across scientific domains. A minimal RO-Crate may be represented as a persistent URI resolving to a summary website that describes the outputs of a scientific investigation (e.g. https://w3id.org/dgarijo/ro/sepln2022, with links to the datasets and software used).
One of the advantages of RO-Crates is flexibility, particularly regarding the metadata accompanying the actual research outcome. RO-Crate extends schema.org, a popular vocabulary for describing resources on the Web (
RO-Crates have been combined with machine-actionable Data Management Plans (maDMPs) to automate and facilitate management of research data (
A tailored RO-Crate profile has been defined to represent Electronic Lab Notebooks (ELN) protocols bundled together with metadata and related datasets.
Another example is WorkflowHub (
The workflow profile has been further extended (with OOP-like inheritance) in Workflow Testing RO-Crate, adding formal workflow testing components: this adds operations such as getting remote test instances and test definitions, used by the LifeMonitor service to keep track of the health status of multiple published workflows.
While RO-Crates use Web technologies, they are also self-contained, moving data along with their metadata. This is a powerful construct for interoperability across FAIR repositories, but it raises challenges with regard to the mutability and persistence of crates.
To illustrate how such challenges can be handled, we detail how the WorkflowHub repository follows several FDO principles:
Workflow entries must be frozen for editing and have complete kernel metadata (title, authors, license, description) [FDOF4] before they can be assigned a persistent identifier, e.g. https://doi.org/10.48546/workflowhub.workflow.255.1 [FDOF1]
Computational workflows can be composed of multiple files used as a whole, e.g. CWL files in a GitHub repository. These are snapshotted as a single RO-Crate ZIP, indicating the main workflow. [FDOF11]
PID resolution can content-negotiate to DataCite’s PID metadata [FDOF2] or use FAIR Signposting to find an RO-Crate containing the workflow [FDOF3] and richer JSON-LD metadata resources [FDOF5,FDOF8], see Fig.
Metadata uses schema.org [FDOF7] following the community-developed Bioschemas ComputationalWorkflow profile [FDOF10].
Workflows are discovered using the GA4GH TRS API [FDOF5,FDOF6,FDOF11] and created/modified using CRUD operations [FDOF6]
The RO-Crate profile, effectively the FDO Type [FDOF7], is declared as https://w3id.org/workflowhub/workflow-ro-crate/1.0; the workflow language (e.g. https://w3id.org/workflowhub/workflow-ro-crate#galaxy) is defined in metadata of the main workflow.
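The FAIR Signposting step in the list above can be illustrated with a small sketch. It parses an HTTP Link header of the kind a PID landing page might return and selects targets by link relation. Only the DOI is taken from the text: the example.org URLs, the header value as a whole, and the helper names parse_link_header and by_relation are assumptions made for illustration, and the parser covers only the simple header shapes shown here, not every case allowed by the Web Linking specification.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
# A single parameter, with either a quoted or a bare value.
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')

# Hypothetical header: only the DOI appears in the text, the rest is made up.
EXAMPLE_HEADER = (
    '<https://doi.org/10.48546/workflowhub.workflow.255.1>; rel="cite-as", '
    '<https://example.org/workflows/255/ro_crate?version=1>; rel="item"; '
    'type="application/zip", '
    '<https://example.org/workflows/255.jsonld>; rel="describedby"; '
    'type="application/ld+json"'
)


def parse_link_header(header):
    """Return a list of dicts, one per link, with 'target' plus its parameters."""
    links = []
    for target, raw_params in LINK_RE.findall(header):
        params = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted or bare
        links.append({"target": target, **params})
    return links


def by_relation(links, rel):
    """Select links whose rel parameter contains the given relation type."""
    return [link for link in links if rel in link.get("rel", "").split()]


if __name__ == "__main__":
    parsed = parse_link_header(EXAMPLE_HEADER)
    for relation in ("cite-as", "item", "describedby"):
        for link in by_relation(parsed, relation):
            print(relation, link["target"], link.get("type", ""))
```

A client following this pattern would use the cite-as target as the persistent identifier, the item link to retrieve the RO-Crate archive, and the describedby link to fetch the richer JSON-LD metadata.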
FAIR Signposting on a workflow PID (
Further work on RO-Crate profiles includes formalising links to the API operations and repositories [FDOF5,FDOF7], including PIDs of profiles and types in the FAIR Signposting, and supporting HTTP navigation to individual resources within the RO-Crate.
RO-Crate has seen broad adoption by communities across many scientific disciplines, providing a lightweight, and therefore easy-to-adopt, approach to generating FAIR Digital Objects. It is rapidly becoming an integral part of the interoperability fabric between different components, as demonstrated here for WorkflowHub, and contributes to building the European Open Science Cloud.
Keywords: FAIR, research object, linked data, RO-Crate, JSON-LD, FDO, WorkflowHub
Presenting author: Stian Soiland-Reyes
Presented at: First International Conference on FAIR Digital Objects, poster
We would like to acknowledge the RO-Crate community and the WorkflowHub Club.
Stian Soiland-Reyes is supported by the European Union’s Horizon 2020 research and innovation programme under grant agreements 823830 (BioExcel-2) and 824087 (EOSC-Life), and by the Horizon Europe programme under grant agreement 101046203 (BY-COVID).
Daniel Garijo is supported by the Madrid Government (Comunidad de Madrid-Spain) under the Multiannual Agreement with Universidad Politécnica de Madrid in the line Support for R&D projects for Beatriz Galindo researchers, in the context of the V PRICIT (Regional Programme of Research and Technological Innovation).
Leyla Jael Garcia is supported by the German Research Foundation (DFG) grant for NFDI4DataScience.
Frederik Coppens is supported by the Research Foundation - Flanders (FWO) for ELIXIR Belgium (I002819N).
Author contributions to this article according to the Contributor Roles Taxonomy CASRAI CRediT:
Resources described by an RO-Crate are also typed, e.g. Person, Organization, ScholarlyArticle, ImageObject
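As a minimal sketch of what such typing looks like in practice, the following builds an RO-Crate metadata document as a plain Python dictionary and indexes its entities by type. The JSON-LD context, the metadata descriptor and the root dataset follow the RO-Crate 1.1 layout. All names, identifiers and example.org URLs are hypothetical, and the helpers make_crate_metadata and entities_by_type are illustrative and not part of any RO-Crate library.

```python
import json


def make_crate_metadata():
    """Build a minimal RO-Crate 1.1 metadata document (hypothetical content)."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {   # Metadata descriptor: identifies the root dataset of the crate.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {   # Root dataset: the research outcome as a whole.
                "@id": "./",
                "@type": "Dataset",
                "name": "Example crate (hypothetical)",
                "author": {"@id": "#alice"},
                "citation": {"@id": "https://example.org/article"},
                "hasPart": [{"@id": "figure.png"}],
            },
            {
                "@id": "#alice",
                "@type": "Person",
                "name": "Alice Example",
                "affiliation": {"@id": "https://example.org/org"},
            },
            {
                "@id": "https://example.org/org",
                "@type": "Organization",
                "name": "Example University",
            },
            {
                "@id": "https://example.org/article",
                "@type": "ScholarlyArticle",
                "name": "An example article",
            },
            {   # A file may carry more than one type.
                "@id": "figure.png",
                "@type": ["File", "ImageObject"],
                "name": "Example figure",
            },
        ],
    }


def entities_by_type(crate):
    """Map each type name to the identifiers of the entities that declare it."""
    index = {}
    for entity in crate["@graph"]:
        types = entity["@type"]
        if isinstance(types, str):
            types = [types]
        for type_name in types:
            index.setdefault(type_name, []).append(entity["@id"])
    return index


if __name__ == "__main__":
    crate = make_crate_metadata()
    print(json.dumps(entities_by_type(crate), indent=2))
```

Serialising the dictionary with json.dumps and saving it as ro-crate-metadata.json next to the described files gives the self-contained layout discussed earlier, where metadata travels together with the data.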