Realizing FAIR Digital Objects for the German Helmholtz Association of Research Centres


Abstract
The Helmholtz Association (Anonymous 2022d), the largest association of large-scale research centres in Germany, covers a wide range of research fields and employs more than 43,000 researchers. In 2019, the Helmholtz Metadata Collaboration (HMC) Platform (Anonymous 2022f) was started as a joint endeavor across all research areas of the Helmholtz Association to make the depth and breadth of research data produced by Helmholtz Centres findable, accessible, interoperable, and reusable (FAIR) for the whole science community. To reach this goal, the concept of FAIR Digital Objects (FAIR DOs) has been chosen as the top-level commonality for existing and future infrastructures of all research fields.
In doing so, HMC follows the original approach of realizing FAIR DOs based on globally unique Persistent Identifiers (PIDs), e.g., provided by https://handle.net/, machine-actionable PID Records, and strong typing using Data Types like https://dtr-test.pidconsortium.eu/#objects/21.T11148/1c699a5d1b4ad3ba4956 registered in a Data Type Registry, e.g., http://dtr-test.pidconsortium.eu/. In all these areas, HMC can build on the groundwork of the Research Data Alliance and the FAIR DO Forum. However, when it comes to realization, there are still some gaps that will have to be addressed during our work and that will be raised in this presentation. Currently, a demonstrator is being implemented that integrates the above components and services, i.e., PID Service, Data Type Registry, and Typed PID Maker. Fig. 1 outlines the architecture of the first version of the demonstrator.
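To illustrate what strong typing of a PID record can look like, the following minimal sketch models a record as a list of entries whose keys are themselves PIDs of registered data types. It is an illustration under stated assumptions, not the demonstrator's data model: the class names, the record PID, and the value are hypothetical placeholders, and only the data type PID 21.T11148/1c699a5d1b4ad3ba4956 is taken from the text above (what that type denotes is not assumed here).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class TypedEntry:
    """One key-value pair of a PID record; the key is the PID of a data type."""
    type_pid: str  # PID of a data type registered in a Data Type Registry
    value: str     # value expected to conform to that data type


@dataclass
class PidRecord:
    """Hypothetical in-memory model of a typed PID record."""
    pid: str
    entries: List[TypedEntry] = field(default_factory=list)

    def values_of(self, type_pid: str) -> List[str]:
        """Return all values stored under the given data type PID."""
        return [e.value for e in self.entries if e.type_pid == type_pid]


# Data type PID quoted in the text; record PID and value are placeholders.
EXAMPLE_TYPE = "21.T11148/1c699a5d1b4ad3ba4956"

record = PidRecord(
    pid="21.EXAMPLE/hypothetical-record",
    entries=[TypedEntry(EXAMPLE_TYPE, "placeholder-value")],
)
```

Because every key resolves to a registered type definition, a machine reading the record can look up how to interpret each value without prior knowledge of the repository that produced it.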
In this first version, a user starts a semi-automatic workflow by entering a Zenodo (Anonymous 2022a) PID in a graphical Web frontend. A mapping component attempts to automatically fill at least the properties required by the Helmholtz Kernel Information Profile from the obtained Zenodo metadata record. In a manual validation loop, the user may add or update certain properties before they are sent to an instance of the Typed PID Maker, validated against the Helmholtz Kernel Information Profile, and stored in the record of a newly registered PID using the services of the ePIC consortium. In addition, registered PID records are made searchable via the graphical frontend on top of a search index, e.g., realized using https://www.elastic.co/.
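The semi-automatic workflow described above can be sketched roughly as follows. This is a simplified illustration, not the demonstrator's code: the property names, the shape of the source metadata, and the function names are assumptions made for this example. The actual profile defines its own required properties, and fetching metadata as well as PID registration are left out entirely.

```python
from typing import Dict, List, Optional

# Hypothetical required properties; the real Helmholtz Kernel Information
# Profile defines its own set, identified by data type PIDs.
REQUIRED = ("digitalObjectLocation", "dateCreated", "license")


def map_metadata(source: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Fill profile properties from a simplified repository metadata record."""
    return {
        "digitalObjectLocation": source.get("landing_page"),
        "dateCreated": source.get("publication_date"),
        "license": source.get("license"),
    }


def missing_properties(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the required properties that are still empty."""
    return [name for name in REQUIRED if not record.get(name)]


# Simplified stand-in for a metadata record obtained from a repository.
source = {
    "landing_page": "https://example.org/record/1",
    "publication_date": "2022-01-01",
}

record = map_metadata(source)
gaps = missing_properties(record)       # the automatic mapping left a gap

# Manual validation loop: the user supplies what the mapper could not fill.
record["license"] = "CC-BY-4.0"
ready = not missing_properties(record)  # record could now be sent onward
```

Separating the mapper from the completeness check keeps the check reusable when mappers for other repository platforms are added later.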
After this generic workflow has been implemented, additional mappers supporting other repository platforms will be added based on the lessons learned. This will lead to a growing number of FAIR DOs and holds the potential to provide significant benefits to scientists, e.g., a central point of contact for research datasets stored in different repositories, machine-actionable identification of relevant datasets, and creation of knowledge graphs representing relationships between datasets, repository platforms, researchers, and research organizations.
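As a rough illustration of how relationships expressed in PID records could feed a knowledge graph, the sketch below flattens records into subject-relation-object triples and queries them. All identifiers, relation names, and function names are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical, simplified records keyed by placeholder identifiers.
records: Dict[str, Dict[str, List[str]]] = {
    "pid:dataset-a": {"storedIn": ["repo:one"], "createdBy": ["person:x"]},
    "pid:dataset-b": {"storedIn": ["repo:two"], "createdBy": ["person:x"]},
}


def to_edges(recs: Dict[str, Dict[str, List[str]]]) -> Set[Edge]:
    """Flatten records into a set of (subject, relation, object) triples."""
    return {
        (pid, relation, target)
        for pid, props in recs.items()
        for relation, targets in props.items()
        for target in targets
    }


def subjects(edges: Set[Edge], relation: str, target: str) -> List[str]:
    """Find all subjects linked to the target by the given relation."""
    return sorted(s for s, r, o in edges if r == relation and o == target)


edges = to_edges(records)
# Datasets of one researcher, although they live in different repositories.
by_person_x = subjects(edges, "createdBy", "person:x")
```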
Furthermore, the gathered experience and its documentation will help others apply the FAIR DO concept more easily, which will lead to an ever-growing collection of available FAIR DOs of increasing quality and with a higher level of automation at creation time.

Keywords
Helmholtz Metadata Collaboration Platform, Persistent Identifiers, PID Kernel Information Profile, Demonstrator

Fig. 1: Architecture of the FAIR DO demonstrator.

Presented at
First International Conference on FAIR Digital Objects, presentation