Research Ideas and Outcomes : Conference Abstract
The Comparative Anatomy of Nanopublications and FAIR Digital Objects
Erik Anthony Schultes‡,§, Barbara Magagna|, Tobias Kuhn¶, Marek Suchánek#, Luiz Olavo Bonino da Silva Santos¤, Barend Mons«
‡ Leiden University, Leiden, Netherlands
§ Leiden Center for Data Science, Leiden, Netherlands
| GO FAIR Foundation, Leiden, Netherlands
¶ Vrije Universiteit Amsterdam, Department of Computer Science, Amsterdam, Netherlands
# Czech Technical University in Prague, Faculty of Information Technology, Prague, Czech Republic
¤ University of Twente, Enschede, Netherlands
« Leiden University Medical Center, Leiden, Netherlands
Open Access

Abstract

Beginning in 1995, early Internet pioneers proposed Digital Objects as encapsulations of data and metadata made accessible through persistent identifier resolution services (Kahn and Wilensky 2006). In recent years, this Digital Object Architecture has been extended to include the FAIR Guiding Principles (Wilkinson et al. 2016), resulting in the concept of a FAIR Digital Object (FDO): a minimal, uniform container that makes any digital resource machine-actionable. A global community of experts is currently working intensively to clarify definitions around an FDO Framework (FDOF) and to provide technical specifications for its potential implementation (FAIR DO group 2020, FAIR Digital Object Forum 2020, Bonino da Silva Santos 2021).

Beginning in 2009, nanopublications were independently conceived (Groth et al. 2010) as a minimal, uniform container that makes individual semantic assertions and their associated provenance metadata machine-actionable. They represent minimal units of structured data as citable entities (Mons and Velterop 2009). A nanopublication consists of three parts: an assertion, the provenance of the assertion, and the provenance of the nanopublication itself (the publication info). Nanopublications are implemented in and aligned with Semantic Web technologies such as RDF, OWL, and SPARQL (World Wide Web Consortium (W3C) 2015) and can be permanently and uniquely identified using resolvable Trusty URIs (Groth et al. 2021). The existing Nanopublication Server Network provides vital services for orchestrating nanopublications (Kuhn et al. 2021), including identifier resolution, storage, search, and access. Nanopublications can be used to expose quantitative and qualitative data, as well as hypotheses, claims, negative results, and opinions that are typically unavailable as structured data or go unpublished altogether. The first practical application of nanopublications occurred in 2014, with the publication of millions of nanopublications as part of the FANTOM5 Project (The FANTOM Consortium and the RIKEN PMI and CLST (DGT) 2014, Lizio et al. 2015). Since then, millions of real-world examples spanning diverse knowledge domains have become available on the Nanopublication Server Network.
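The three-part anatomy described above can be illustrated with a minimal sketch. This is not the nanopublication reference implementation: the class name, the `is_well_formed` check, and every `example.org` URI are assumptions made for illustration, and plain strings stand in for proper RDF terms.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A triple is (subject, predicate, object); plain strings stand in for RDF terms.
Triple = Tuple[str, str, str]


@dataclass
class Nanopublication:
    """Hypothetical in-memory model of the three parts of a nanopublication."""
    uri: str
    assertion: List[Triple]   # the claim itself
    provenance: List[Triple]  # where the assertion came from
    pubinfo: List[Triple]     # provenance of the nanopublication as a whole

    def is_well_formed(self) -> bool:
        """Return True when all three parts contain at least one triple."""
        return all(len(part) > 0
                   for part in (self.assertion, self.provenance, self.pubinfo))


# Illustrative instance; every example.org URI below is a placeholder.
NP = "http://example.org/np/1"
example = Nanopublication(
    uri=NP,
    assertion=[("http://example.org/gene/A",
                "http://example.org/isExpressedIn",
                "http://example.org/tissue/B")],
    provenance=[(NP + "#assertion",
                 "http://www.w3.org/ns/prov#wasAttributedTo",
                 "http://example.org/person/alice")],
    pubinfo=[(NP,
              "http://purl.org/dc/terms/created",
              "2022-01-01T00:00:00Z")],
)

print(example.is_well_formed())  # prints True
```

In an actual nanopublication each part would be an RDF named graph; the lists of string triples here only mirror that separation.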

Like nanopublications, the FDOF posits an ultra-minimal approach to structured, self-contained, machine-readable data and metadata. An FDO consists of: the object itself (referred to here as the resource, to avoid confusion with other meanings of the term “object”); the metadata describing the resource; and a globally unique and persistent identifier with predictable resolution behaviors.
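As a rough illustration of these three components, the sketch below models an FDO record and a toy in-memory resolver. It does not follow any published FDOF specification: the names `FairDigitalObject`, `ToyResolver`, and `resource_location`, the identifier `pid:example/0001`, and the metadata keys are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FairDigitalObject:
    """Hypothetical record tying an identifier to metadata and a resource."""
    pid: str                  # globally unique, persistent identifier
    metadata: Dict[str, str]  # description of the resource
    resource_location: str    # machine-actionable path to the resource


class ToyResolver:
    """In-memory stand-in for a persistent identifier resolution service."""

    def __init__(self) -> None:
        self._records: Dict[str, FairDigitalObject] = {}

    def register(self, fdo: FairDigitalObject) -> None:
        # Identifiers must stay unique, so re-registration is rejected.
        if fdo.pid in self._records:
            raise ValueError(f"identifier already registered: {fdo.pid}")
        self._records[fdo.pid] = fdo

    def resolve(self, pid: str) -> FairDigitalObject:
        # Predictable behavior: a known PID always yields the same record,
        # and an unknown PID always raises KeyError.
        return self._records[pid]


resolver = ToyResolver()
resolver.register(FairDigitalObject(
    pid="pid:example/0001",  # placeholder identifier
    metadata={"type": "http://example.org/types/Dataset",
              "title": "Example dataset"},
    resource_location="http://example.org/data/0001.csv",
))

record = resolver.resolve("pid:example/0001")
print(record.metadata["type"], record.resource_location)
```

The point of the sketch is the resolution path: a client starts from the identifier alone, obtains the metadata, and finds the location of the resource inside that record.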

These two technologies share the same vision of a data infrastructure and act as instances of Machine-Actionable Containers (MACs) that use minimal, uniform standards to enable FAIR operations. Here, we compare the structure and computational behaviors of the existing nanopublication infrastructure to those of the proposed FAIR Digital Object Framework. Although the two were developed independently, there are clear parallels between the vision and the approach of nanopublications and the FDOF. Each aspires to minimal standards for the encapsulation of digital information into free-standing, publishable (citable, referenceable) entities. These minimal standards involve globally unique and persistent identifiers that resolve to standardized, semantically enabled metadata descriptions, which in turn include machine-actionable paths to the resource itself.

At the same time, there are also differences. The scope of nanopublications is limited to the assertional data type and, as the name suggests, nanopublications should remain small (limited to single assertions expressed as individual triples or small RDF graphs). In contrast, FDOs are unlimited in scope, accommodating digital resources of arbitrary size, type, and complexity, so long as their type can be ontologically described. Furthermore, whereas nanopublications represent a moderately mature technology, the FDOF is a specification still under development. Formally identifying points of contact between the two approaches would allow the FDO community to leverage the vast practical experience gained in the nanopublishing of assertions.

Here, inspired by recent applications of nanopublications in the FIP Wizard tool (Schultes et al. 2020) and their extension to research claims (Kuhn 2022, McNamara 2022) and to data (Schultes 2022a, Schultes 2022b), we attempt a point-by-point comparison of the specifications of nanopublications and FDOs. We find a remarkable congruence between the currently proposed FDO requirements and the existing nanopublication infrastructure, including several FDO-like qualities already embodied in the nanopublication ecosystem.

Keywords

FAIR Principles, Nanopublication, Nanopublication Ecosystem, Machine-Actionable Containers, FIP Wizard, FAIR Wizard of Leiden

Presenting author

Erik Anthony Schultes

Presented at

First International Conference on FAIR Digital Objects, presentation

References