Research Ideas and Outcomes :
Conference Abstract
Corresponding author: Stian Soiland-Reyes (soiland-reyes@manchester.ac.uk)
Received: 05 Sep 2022 | Published: 12 Oct 2022
© 2022 Stian Soiland-Reyes, Leyla Jael Castro, Daniel Garijo, Marc Portier, Carole Goble, Paul Groth
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Soiland-Reyes S, Castro LJ, Garijo D, Portier M, Goble C, Groth P (2022) Updating Linked Data practices for FAIR Digital Object principles. Research Ideas and Outcomes 8: e94501. https://doi.org/10.3897/rio.8.e94501
Background
The FAIR principles (
The European Open Science Cloud (EOSC) has been instrumental in maturing and encouraging FAIR practices across a wide range of research areas. Linked Data in the form of RDF (Resource Description Framework) is the common way to implement machine-readability in FAIR; however, the principles do not prescribe RDF or any particular technology (
FAIR Digital Object
FAIR Digital Object (FDO) (
FDO is a set of principles (
More recently, the FDO Forum has prepared detailed recommendations, currently open for comments, including a DOIP endorsement and updated FDO requirements. These point out Linked Data as another possible technology stack, which is the focus of this work.
Linked Data
Linked Data standards (LD), based on the Web architecture, are commonplace in sciences like bioinformatics, chemistry and medical informatics, in particular to publish Open Data as machine-readable resources. LD has also become ubiquitous on the general Web: the schema.org vocabulary is used by over 10 million sites for indexing by search engines, and 43% of all websites use JSON-LD.
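As an illustration of the kind of machine-readable description mentioned above, the following minimal sketch builds a schema.org description of a dataset and serialises it as JSON-LD. The function name `dataset_jsonld`, the example.org identifier and the dataset name are hypothetical and chosen only for illustration; they are not taken from any particular repository.

```python
import json


def dataset_jsonld(pid: str, name: str, license_url: str) -> dict:
    """Return a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",  # maps the keys below to schema.org terms
        "@id": pid,                          # identifier of the described resource
        "@type": "Dataset",
        "name": name,
        "license": {"@id": license_url},    # link to another resource, not a plain string
    }


# Hypothetical identifier and name, for illustration only.
doc = dataset_jsonld(
    "https://example.org/dataset/42",
    "Example dataset",
    "https://creativecommons.org/licenses/by/4.0/",
)
print(json.dumps(doc, indent=2))
```

Because the license is given as an `@id` reference rather than a literal, a consumer can follow it as a link, which is the basic Linked Data pattern.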
Although LD practices align to FAIR (
Meeting FDO principles using Linked Data standards
Considering the potential of FDOs when combined with the mature technology stack of LD, here we briefly discuss how FDO principles in
However, when considering the specific principles (FDOF1–FDOF12) we find that additional constraints and best practices need to be established – arbitrary LD resources cannot be assumed to follow FDO principles. This is equivalent to how existing use of DOIP is not FDO-compliant without additional constraints.
Namely, persistent identifiers (PIDs) (
While CRUD operations (FDOF6) are supported by native HTTP operations (GET/PUT/POST/DELETE) as in LDP, there is little consistency on how to define operation interfaces in LD (FDOF5). Existing REST approaches like OpenAPI and URI templates are mature and good candidates, and should be related to defined types to support machine-actionable composition (FDOF7). The HTTP status code 410 Gone is used in tombstone pages for removed resources (FDOF12), although 404 Not Found is more frequent.
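The correspondence between CRUD operations and HTTP methods, and the distinction between a tombstone and a plain missing resource, can be sketched as follows. This is one common mapping under simplifying assumptions (for example, LDP also allows creation with PUT); the names `CRUD_TO_HTTP`, `ResourceState` and `classify_status` are hypothetical and not part of any FDO or LDP specification.

```python
from enum import Enum

# One common mapping of CRUD operations to HTTP methods (an assumption;
# individual services may differ, e.g. by creating resources with PUT).
CRUD_TO_HTTP = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


class ResourceState(Enum):
    AVAILABLE = "available"
    TOMBSTONE = "tombstone"          # resource existed and was deliberately removed
    NOT_FOUND = "not_found"          # no statement about whether it ever existed
    OTHER = "other"


def classify_status(status: int) -> ResourceState:
    """Interpret an HTTP status code from the point of view of FDOF12."""
    if 200 <= status < 300:
        return ResourceState.AVAILABLE
    if status == 410:
        return ResourceState.TOMBSTONE
    if status == 404:
        return ResourceState.NOT_FOUND
    return ResourceState.OTHER


for code in (200, 404, 410):
    print(code, classify_status(code).value)
```

A client that receives 404 cannot tell a deleted object from a mistyped identifier, which is why an explicit 410 response is the more useful signal for tombstones.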
Metadata is resolved to HTTP documents with their own URIs, but these frequently don’t have their own PID (FDOF8). RDF-Star and nanopublications (
Different metadata levels (FDOF9) are frequently developed for LD vocabularies across different communities (FDOF10), such as FHIR for health data, Bioschemas for bioinformatics and >1000 more specific bioontologies. Increased declaration and navigation of profiles is therefore essential for machine-actionability and consistent consumption across FAIR endpoints.
Several standards exist for rich collections (FDOF11), e.g. OAI-ORE, DCAT, RO-Crate, LDP. These are used and extended heterogeneously across the Web, but consistent machine-actionable FDOs will need specific choices of core standards and vocabularies. Another challenge is when multiple PIDs refer to “almost the same” concept in different collections – significant effort has gone into creating manual and automated semantic mappings (
Currently the FDO Forum has suggested the use of LDP as a possible alternative for implementing FAIR Digital Objects (
Discussion
The Linked Data stack provides a set of specifications, tools and guidelines in order to help the FDO principles become a reality. This mature approach can accelerate uptake of FDO by scholars and existing research infrastructures such as the European Open Science Cloud (EOSC).
However, the number of standards and existing metadata vocabularies poses a potential threat to adoption and interoperability. Yet the challenges of agreeing on usage profiles apply equally to DOIP and LD approaches.
We have worked with different scientific communities to define RO-Crate (
We have also used FAIR Signposting (
We believe that by adopting Linked Data principles, we can accelerate FDO today – and even start building practical ways to assist scientists in efficiently answering topical questions based on knowledge graphs.
FAIR Digital Object, FDO, FAIR, FAIR Signposting, Linked Data, RDF, standards, best practices
Stian Soiland-Reyes
First International Conference on FAIR Digital Objects, presentation
We would like to acknowledge the RO-Crate community and the WorkflowHub Club. Thanks to Rudolf Wittner for valuable comments.
Stian Soiland-Reyes is supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement numbers H2020-INFRAEDI-02-2018 823830 (BioExcel-2), H2020-INFRAEOSC-2018-2 824087 (EOSC-Life) and the Horizon Europe programme under grant agreements HORIZON-INFRA-2021-EMERGENCY-01 101046203 (BY-COVID), HORIZON-INFRA-2021-EOSC-01 101057344 (FAIR-IMPACT).
Leyla Jael Castro is supported by a German Research Foundation DFG grant for NFDI4DataScience.
Daniel Garijo is supported by the Madrid Government (Comunidad de Madrid-Spain) under the Multiannual Agreement with Universidad Politécnica de Madrid in the line Support for R&D projects for Beatriz Galindo researchers, in the context of the V PRICIT (Regional Programme of Research and Technological Innovation).
Author contributions to this article according to the Contributor Roles Taxonomy CASRAI CRediT: