Research Ideas and Outcomes : Conference Abstract
Challenges for FAIR Digital Object Assessment
Esteban Gonzalez, Daniel Garijo, Oscar Corcho
‡ Universidad Politécnica de Madrid, Madrid, Spain

Abstract

A Digital Object (DO) "is a sequence of bits, incorporating a work or portion of a work or other information in which a party has rights or interests, or in which there is value". DOs should have persistent identifiers and metadata, and be readable by both humans and machines. A FAIR Digital Object (FDO) is a DO able to interact with automated data processing systems (De Smedt et al. 2020) while following the FAIR principles (Findable, Accessible, Interoperable and Reusable) (Wilkinson et al. 2016).

Although FAIR was originally targeted towards data artifacts, new initiatives have emerged to adapt the principles to other digital research resources such as software (Katz et al. 2021, Lamprecht et al. 2020), ontologies (Poveda-Villalón et al. 2020), virtual research environments and even DOs themselves (Collins et al. 2018). In this paper, we describe the challenges of assessing the level of compliance of a DO with the FAIR principles (i.e., its FAIRness), assuming that a DO contains multiple resources and captures their relationships. We explore different methods to calculate an evaluation score, and we discuss the challenges and importance of providing explanations and guidelines for authors.

FAIR assessment tools

There is a growing number of tools to assess the FAIRness of DOs. Community groups like FAIRassist.org have compiled lists of guidelines and tools for assessing the FAIRness of digital resources, ranging from self-assessment questionnaires and checklists to semi-automated validators (Devaraju et al. 2021). Examples of automated validation tools include the F-UJI Automated FAIR Data Assessment Tool (Devaraju and Huber 2020), FAIR Evaluator and FAIR Checker for datasets or individual DOs; HowFairIs (Spaaks et al. 2021) for code repositories; and FOOPS (Garijo et al. 2021) to assess ontologies.

When it comes to assessing FDOs, we find two main challenges:

  • Resource score discrepancy: Different FAIR assessment tools for the same type of resource produce different scores. For example, a recent study on datasets shows differences in scores for the same resource, due to how different tool authors interpret the FAIR principles (Dumontier 2022).
  • Heterogeneous FDO metadata: Validators include tests that explore the metadata of the digital object. However, there is no agreed metadata schema for representing FDO metadata, which complicates this operation. In addition, metadata may be specific to a certain domain (De Smedt et al. 2020). To address this challenge, we need i) to agree on a minimum common set of metadata to measure the FAIRness of DOs and ii) to propose a framework to describe extensions for specialized digital objects (datasets, software, ontologies, VREs, etc.), as sketched below.
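
As a purely hypothetical illustration of points i) and ii), the sketch below (in Python, with invented field names rather than any agreed schema) shows a minimal common core of metadata shared by all FDOs, extended with type-specific fields for a software DO:

    # Hypothetical minimal common metadata that any FDO could be expected to expose.
    common_core = {
        "identifier": "doi:10.1234/example",   # persistent identifier
        "title": "Example digital object",
        "license": "https://creativecommons.org/licenses/by/4.0/",
        "creator": "Example Author",
    }

    # Hypothetical type-specific extension for a software DO.
    software_extension = {
        "programmingLanguage": "Python",
        "codeRepository": "https://example.org/repo",
    }

    # A specialized FDO description combines the common core with its extension.
    software_fdo_metadata = {**common_core, **software_extension}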

In Wilkinson et al. (2019), the authors propose a community-driven framework to assess the FAIRness of individual digital objects. This framework is based on:

  1. a collection of maturity indicators,
  2. principle compliance tests, and
  3. a module to apply those tests to digital resources.

The proposed indicators may be a starting point to define which tests are needed for each type of resource (de Miranda Azevedo and Dumontier 2020).
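
A minimal sketch of how such a framework could be organized is shown below. The class and function names, as well as the example indicators, are our own illustrative assumptions and do not correspond to the actual FAIR Evaluator implementation:

    from dataclasses import dataclass
    from typing import Callable, Dict, List

    @dataclass
    class MaturityIndicator:
        """A community-defined maturity indicator tied to one FAIR principle."""
        identifier: str                   # illustrative name of the indicator
        principle: str                    # "F", "A", "I" or "R"
        test: Callable[[Dict], bool]      # compliance test applied to DO metadata

    def run_tests(metadata: Dict, indicators: List[MaturityIndicator]) -> Dict[str, List[bool]]:
        """Apply every compliance test to a digital resource and group results by principle."""
        results: Dict[str, List[bool]] = {}
        for ind in indicators:
            results.setdefault(ind.principle, []).append(ind.test(metadata))
        return results

    # Two illustrative indicators: a persistent identifier check and a license check.
    indicators = [
        MaturityIndicator("has_persistent_identifier", "F", lambda m: bool(m.get("identifier"))),
        MaturityIndicator("has_license", "R", lambda m: bool(m.get("license"))),
    ]
    print(run_tests({"identifier": "doi:10.1234/example"}, indicators))
    # {'F': [True], 'R': [False]}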

Aggregation of FAIR metrics

Another challenge is how to produce an assessment score for a FDO, independently of the tests that are run to assess it. For example, each of the four FAIR dimensions (Findable, Accessible, Interoperable and Reusable) usually has a different number of associated assessment tests. If the final score is based purely on the number of passed tests, then by default some dimensions carry more weight than others. Similarly, not all tests may have the same importance for specific resources (e.g., in some cases having a license in a resource may be considered more important than having its full documentation).
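
As a hypothetical numerical illustration (the test counts are invented for this example): suppose Findability has 10 associated tests while Accessibility, Interoperability and Reusability have 3 each. A resource that passes only the 10 Findability tests would obtain

    S_{\text{per-test}} = \frac{10}{10+3+3+3} \approx 0.53, \qquad
    S_{\text{per-principle}} = \frac{1}{4}\left(\frac{10}{10}+\frac{0}{3}+\frac{0}{3}+\frac{0}{3}\right) = 0.25

i.e., a per-test score already above 50% even though three of the four dimensions are not addressed at all.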

In our work we consider a FDO as an aggregation of resources, and therefore we face the additional challenge of creating an aggregated FAIRness score for the whole FDO. We consider the following aggregation scores:

  • Global score: calculated with the formula in Fig. 1 (1). It represents the percentage of total passed tests and does not take into account the principle to which a test belongs.
  • FAIR average score: calculated with the formula in Fig. 1 (2). It represents the average of the passed-test ratios for each principle plus the ratio of passed tests used to evaluate the Research Object itself.
Figure 1. Possible formulas for determining an aggregated FAIRness score for the entire FDO.
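
Since the figure itself is not reproduced here, one plausible formalization of the two scores, consistent with the descriptions above (with passed_p and total_p denoting the passed and available tests for principle p, and RO the tests applied to the Research Object itself), would be:

    S_{\text{global}} = 100 \cdot \frac{\text{number of passed tests}}{\text{number of total tests}}

    S_{\text{avg}} = \frac{100}{5} \left( \sum_{p \in \{F,A,I,R\}} \frac{\text{passed}_p}{\text{total}_p} + \frac{\text{passed}_{RO}}{\text{total}_{RO}} \right)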

Both metrics are agnostic to the kind of resource analyzed. The score they produce ranges from 0 to 100.
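
A minimal sketch of how both scores could be computed, assuming test results are grouped by principle (the function names and data layout are illustrative and not tied to any existing tool), might look as follows:

    def global_score(results):
        """Percentage of passed tests over all tests, ignoring principles."""
        total = sum(len(tests) for tests in results.values())
        passed = sum(sum(tests) for tests in results.values())
        return 100 * passed / total if total else 0.0

    def fair_average_score(results):
        """Average of per-group pass ratios (F, A, I, R and the object-level group)."""
        ratios = [sum(tests) / len(tests) for tests in results.values() if tests]
        return 100 * sum(ratios) / len(ratios) if ratios else 0.0

    # Illustrative input: True means a passed test, False a failed one.
    results = {
        "F": [True, True, False],
        "A": [True],
        "I": [False, False],
        "R": [True, True],
        "RO": [True, False],   # tests applied to the Research Object itself
    }
    print(round(global_score(results), 1))        # 60.0
    print(round(fair_average_score(results), 1))  # 63.3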

Discussion

A FDO has metadata records that describe it. Some records are common to all DOs, and others are specific to a particular DO. This makes it difficult to assess some FAIR principles, like F2: "data are described with rich metadata". Therefore, we believe the community should address the discussion of a minimal set of FAIR metadata.

In addition, a FAIR assessment score can change significantly depending on the formula used to aggregate all metrics. Therefore, it is key to explain to users the method and provenance used to produce such a score. Different communities should agree on the best scoring mechanism for their FDOs, e.g., by adding a weight to each principle and deciding on the right number of tests per principle, which otherwise may give more importance to the principles with more tests.
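
As an illustration of such a community-specific weighting (the weights below are arbitrary and only meant as an example):

    def weighted_score(results, weights):
        """Weighted average of per-principle pass ratios; weights are assumed to sum to 1."""
        score = 0.0
        for principle, weight in weights.items():
            tests = results.get(principle, [])
            ratio = sum(tests) / len(tests) if tests else 0.0
            score += weight * ratio
        return 100 * score

    # A hypothetical community that values Findability and Reusability more.
    weights = {"F": 0.3, "A": 0.2, "I": 0.2, "R": 0.3}
    results = {"F": [True, False], "A": [True], "I": [True], "R": [False]}
    print(round(weighted_score(results, weights), 1))  # 55.0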

We believe that the objective of a FAIR scoring system should not be to produce a ranking, but to become a mechanism to improve the FAIRness of a FDO.

Keywords

FAIR assessment, FDO, Digital Objects

Presenting author

Esteban González

Presented at

First International Conference on FAIR Digital Objects, presentation

Funding program

European Commission - Project FAIR IMPACT
