Linked Metadata for FAIR Digital Objects Carrying Computable Knowledge


Introduction
To advance the goals of the Mobilizing Computable Biomedical Knowledge (MCBK) Movement, we are exploring the use of FAIR Digital Objects (FDOs) (De Smedt et al. 2020, Williams et al. 2021).
First, we are beginning to clarify the full range of metadata for FDOs that carry bit sequences expressing knowledge in machine-readable or executable formats. We view knowledge through an empirical lens as the reliable, valid, and valued results of analytic or deliberative data analysis. Computability of knowledge refers to the degree to which knowledge is formally represented for use by computing machines.
Second, we are working out how to apply linked data principles to FDO metadata records (Bizer et al. 2008). Linked data are structured data with openly defined and uniquely identified concepts. We are developing linked metadata that conform to the Resource Description Framework (RDF), where domains of interest are represented using a pattern of subject-predicate-object "triples." RDF triples give rise to machine-actionable FDO metadata records that can be visualized as directed graphs.
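As a rough illustration of the triple pattern, the sketch below represents a few metadata statements as plain Python tuples and groups them into a directed-graph view, with subjects as nodes and predicates as labeled edges. The identifiers (`ex:ko1`, `ex:team1`) and literal values are hypothetical placeholders and are not taken from our metadata records.

```python
from collections import defaultdict

# Hypothetical triples; every identifier and value here is illustrative only.
TRIPLES = [
    ("ex:ko1", "dcterms:title", "Example risk calculator"),
    ("ex:ko1", "dcterms:creator", "ex:team1"),
    ("ex:team1", "dcterms:title", "Example team"),
]


def to_graph(triples):
    """Group triples by subject so each node maps to its labeled outgoing edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = to_graph(TRIPLES)

# Print each edge of the directed graph as: subject --predicate--> object
for node, edges in graph.items():
    for predicate, target in edges:
        print(f"{node} --{predicate}--> {target}")
```

Because the object of one triple (`ex:team1`) can be the subject of another, the statements chain together into a graph rather than a flat list of key-value pairs.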
In keeping with the FAIR Digital Object Framework (FDOF), we value linked metadata as a general method of bringing consistency to FDO metadata records, so that artificial agents can act on them in predictable ways. Five other benefits of linked metadata are that they are divisible, aggregable, extensible, queryable (using SPARQL), and support logical inferencing.
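Some of these properties can be sketched with nothing more than Python sets of triples. The example below is a simplified stand-in, not SPARQL and not our tooling: the `match` helper is a hypothetical pattern matcher in which `None` plays the role of a query variable, and all identifiers are invented for illustration.

```python
# Two hypothetical metadata records, each a set of (subject, predicate, object) triples.
record_a = {
    ("ex:ko1", "dcterms:title", "Example risk calculator"),
    ("ex:ko1", "dcterms:subject", "ex:cardiology"),
}
record_b = {
    ("ex:ko1", "dcterms:license", "ex:open-license"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return {
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


# Aggregable: records about the same subject merge by set union.
merged = record_a | record_b

# Queryable: select every title statement in the merged record.
titles = match(merged, p="dcterms:title")

# Divisible: carve out just the licensing statements about one object.
license_part = match(merged, s="ex:ko1", p="dcterms:license")

print(len(merged), titles, license_part)
```

In an RDF store the same selections would typically be written as SPARQL graph patterns; the set operations here only mirror the underlying idea.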
With a focus specifically on FDOs that carry computable knowledge artifacts at their core, here we present our recent metadata work completed between 2019 and mid-2022.

Metadata Scope for FDOs Carrying Computable Knowledge
This section summarizes previously published work to specify and scope FDO metadata. This work was completed by members of our team and the larger MCBK Movement. Through many dialogs over a period of more than a year, thirteen high-level categories of metadata for FDOs carrying computable knowledge were described (Alper et al. 2021). These categories are listed in Table 1. Next, we briefly discuss six categories marked with an asterisk (*) in Table 1. These six categories are somewhat specific to FDOs that contain computable knowledge.
For Knowledge Domain metadata, a large and growing number of biomedical vocabularies or schemas exist. For clinical terms, the Systematized Nomenclature of Medicine (SNOMED) includes more than 350K RDF classes and 200 properties. Many bioscience vocabularies spanning a wide range of terms from human biology also exist.
Purpose metadata are critical for FDOs that convey computable knowledge about the prevention, diagnosis, treatment, amelioration, and monitoring of disease. Notably, we have yet to find vocabularies for representing clinically oriented FDO purposes as linked metadata.
We anticipate needing FDO-to-FDO Relation metadata. Going beyond citations that relate knowledge to its antecedents, FDOs containing computable biomedical knowledge may

Linked Metadata for Actual FDOs Carrying Computable Knowledge
This section shares new work. Since 2016, we have built and tested several hundred compound Digital Objects (DOs) carrying executable biomedical knowledge in the form of pure functions (e.g., math functions for estimating a health risk) (Beck et al. 2022). Our particular DOs, called Knowledge Objects (KOs), conform to a common design pattern we created (Fig. 1). We have demonstrated how these DOs can be rapidly implemented in several technical environments to enable RESTful webservice requests and responses to and from pure functions of interest in biomedicine.

Figure 1. This figure depicts the parts of a type of DO called a Knowledge Object (KO). The core of the KO is a bit sequence encoding some machine-processable knowledge. This core is referred to as the KO's payload. For all KOs, the payload can be deployed automatically on the web as a webservice by software tools that act on the KO's Deployment and Service Descriptions. The KO and its payload are described by metadata of different kinds. The KO has a persistent identifier (PID) that facilitates gaining access to its components.
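To make the idea of a pure function behind a webservice more concrete, here is a minimal sketch. It is not one of our KO payloads: the function names, the input fields, and the coefficients are all invented for illustration and carry no clinical meaning. The point is only that the output depends solely on the request body, which is what makes such a payload straightforward to wrap in a request and response cycle.

```python
import math


def estimate_risk(inputs):
    """Pure function: the result depends only on the inputs, with no I/O or hidden state.

    The coefficients are made up for this sketch and have no clinical validity.
    """
    age = float(inputs["age"])
    smoker = 1.0 if inputs["smoker"] else 0.0
    score = -5.0 + 0.05 * age + 0.7 * smoker
    # Logistic transform maps the linear score to a value between 0 and 1.
    return {"risk": 1.0 / (1.0 + math.exp(-score))}


def handle_request(body):
    """Hypothetical stand-in for a webservice handler: JSON-like dict in, JSON-like dict out."""
    return {"result": estimate_risk(body)}


response = handle_request({"age": 60, "smoker": True})
print(response)
```

Calling `handle_request` twice with the same body returns the same response, which is the property a deployment tool can rely on when exposing the function over HTTP.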
As a step toward a specific type of FDO for carrying computable knowledge, we have started developing linked metadata records for FDOs using a prototype metadata schema. An example of an early FDO linked data record appears in Example 1.

Example 1. An FDO linked metadata record in JSON-LD format. (Cut and paste into the JSON-LD Playground to visualize.)
The KO described in the linked metadata record above is available here for inspection. As Example 1 shows in bold text, our initial prototype linked metadata record for KOs relies on three vocabularies: Dublin Core Terms, the Function Ontology, and our own Knowledge Object Implementation Ontology (KOIO). As its FDO identifier, the KO uses an Archival Resource Key (ARK). ARKs are attractive because they support a suffix passthrough mechanism for consistently identifying the common parts of a KO, such as Deployment and Service Descriptions. The linked metadata record in Example 1 has been successfully loaded into several RDF systems, including the JSON-LD Playground and an instance of the Blue Brain Nexus knowledge graph system. We have used SPARQL queries to extract and filter elements from this linked metadata record.
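The following sketch shows, under stated assumptions, how a record of this general shape might be assembled and how suffixes on a base ARK can name the parts of a KO. It does not reproduce Example 1. The ARK, the property names under the `koio:` and `fno:` prefixes, the file names, and the two `example.org` namespace IRIs are all hypothetical placeholders; only the Dublin Core Terms namespace is the published one.

```python
import json

# The dcterms IRI is the published Dublin Core Terms namespace. The other two
# IRIs are placeholders (assumptions), not the real FnO or KOIO namespaces.
CONTEXT = {
    "dcterms": "http://purl.org/dc/terms/",
    "fno": "https://example.org/fno#",
    "koio": "https://example.org/koio#",
}

# A made-up ARK used only for this sketch.
BASE_ARK = "ark:/99999/fk4example"


def component_id(base_ark, suffix):
    """Append a suffix to a base ARK, mimicking how a suffix can name one part of an object."""
    return f"{base_ark.rstrip('/')}/{suffix.lstrip('/')}"


# Hypothetical record: class and property names are illustrative, not the KOIO schema.
record = {
    "@context": CONTEXT,
    "@id": BASE_ARK,
    "@type": "koio:KnowledgeObject",
    "dcterms:title": "Example risk calculator",
    "koio:hasServiceSpecification": {"@id": component_id(BASE_ARK, "service.yaml")},
    "koio:hasDeploymentSpecification": {"@id": component_id(BASE_ARK, "deployment.yaml")},
}

print(json.dumps(record, indent=2))
```

Deriving every component identifier from the one base identifier keeps the parts of an object addressable in a consistent way, which is the appeal of suffix passthrough described above.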

Conclusion
For FDOs containing computable knowledge to have high degrees of FAIRness, extensive metadata records are required. Some metadata content specified to date is specific to this type of FDO and payload. It is possible to represent FDO metadata as linked metadata, making the metadata richer semantically and potentially easier to manage with artificial agents and machines. In biomedicine especially, more work is needed to identify additional vocabularies for use as controlled terminologies to arrive at suitably comprehensive linked metadata for this important new type of FDO.