Research Ideas and Outcomes :
Research Article
Corresponding author: Marisa Conte (meese@umich.edu)
Academic editor: Francisco Andres Rivera Quiroz
Received: 10 Jul 2023 | Accepted: 05 Dec 2023 | Published: 18 Dec 2023
© 2023 Marisa Conte, Allen Flynn, Philip Barrison, Peter Boisvert, Zach Landis-Lewis, Charles Friedman
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Conte M, Flynn AJ, Barrison P, Boisvert P, Landis-Lewis Z, Friedman C (2023) Digital objects to make computable biomedical knowledge FAIR: an infrastructural approach to knowledge representation, dissemination and implementation. Research Ideas and Outcomes 9: e109307. https://doi.org/10.3897/rio.9.e109307
We present our work to develop digital objects to represent and convey a specific category of scientific knowledge: computable biomedical knowledge (CBK). Properly developed, validated, implemented and stewarded, CBK has the potential to accelerate the translation of actionable knowledge from scientific discovery to clinical application.
Our research takes an infrastructural approach to CBK, initially by focusing on the creation of a conceptual model for packaging computable biomedical knowledge - the Knowledge Object (KO) - and on corresponding efforts to create an architecture for KO management and implementation. Additionally, our work is grounded in the FAIR principles, such that KO artefacts should be findable, accessible, interoperable and reusable and we are exploring aligning KOs with emerging best practices for FAIR Digital Objects (FDO).
The outcomes of this work resonate in clinical contexts, health professions education, healthcare quality improvement, biomedical and translational research and population care. Our KO model is also of interest to researchers and practitioners interested in knowledge science, including those working with semantic technologies and other forms of digital objects.
computable biomedical knowledge, digital objects, knowledge objects, metadata, FAIR principles
We present our work to develop digital objects to represent and convey a specific category of scientific knowledge: computable biomedical knowledge (CBK). CBK is defined as “the result of an analytic and/or deliberative process about human health or affecting human health, that is explicit and, therefore, can be represented and reasoned upon using logic, formal standards and mathematical approaches” (
Traditional methods of publishing and sharing biomedical knowledge – for example, scholarly scientific papers published in medical journals or knowledge contained in core medical texts – can neither keep pace with the accelerating speed of new knowledge generation (
When properly developed, validated, implemented and stewarded, CBK has the potential to close the gap between scientific discovery of actionable knowledge and its clinical application, a gap often measured in years (
This work is directly related to previous and current efforts to model knowledge in machine-accessible and machine-actionable formats, including digital objects, research objects and semantic publications. With respect to epistemology, in contrast to other classes of data or information, this work emphasises the modelling, symbolic representation and packaging of scientific knowledge. This knowledge has the character of being “accounted semantic information” (
Kahn and Wilensky pioneered the concept of a digital object, defining it as “an instance of an abstract data type that has two components, data and metadata”, in addition to a unique identifier or handle (
Our work is also enhanced by the development and implementation of digital objects engineered for scientific applications. The Research Object (RO) model, which containerises “essential information relating to experiments and investigations”, is particularly relevant as it addresses packaging, publishing and preserving the RO for reuse (
Other related work includes objects to represent experimental or analytic processes. This includes the development of an RDF-encoded, ontology-based model and templates for digital objects describing scientific workflows (
Our work is also related to the development of semantic publication methods, for example, nanopublications or micropublications, which disseminate specific elements from scientific papers in machine-readable and -usable formats. The initial nanopublication model extracts elements from a scientific paper as a machine-readable combination of statement (e.g. a finding or research result) and attribution (information about a statement, for example, author, date, instrumentation), expressed as RDF triples and resolved to a Uniform Resource Identifier (URI) (
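As a rough illustration of the statement-plus-attribution structure described above, the following minimal sketch models a nanopublication-like record as plain subject-predicate-object triples. It uses no RDF library, every URI is a hypothetical placeholder under example.org and the `Triple` type and `render` helper are assumptions made for illustration; published nanopublication schemas are richer than this.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement, in the spirit of an RDF triple."""
    subject: str
    predicate: str
    obj: str


# Hypothetical URI that the whole record would resolve to.
NANOPUB_URI = "https://example.org/np/1"

# The statement: a single finding (all identifiers are made up).
assertion: List[Triple] = [
    Triple(
        "https://example.org/drug/X",
        "https://example.org/vocab/treats",
        "https://example.org/disease/Y",
    ),
]

# The attribution: information about the statement, such as author and date.
attribution: List[Triple] = [
    Triple(NANOPUB_URI, "https://example.org/vocab/author", "https://example.org/person/A"),
    Triple(NANOPUB_URI, "https://example.org/vocab/date", "2023-01-01"),
]


def render(t: Triple) -> str:
    """Render a triple in an N-Triples-like line; non-URI objects become literals."""
    obj = f"<{t.obj}>" if t.obj.startswith("http") else f'"{t.obj}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


for triple in assertion + attribution:
    print(render(triple))
```

Keeping the assertion and the attribution in separate collections mirrors the separation between what is claimed and who claimed it, when and how.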
Scientific Knowledge Objects (SKO) and micropublications aim to represent the contents of a paper in more useful ways. SKO are a semantic representation of the content of scientific papers, including metadata, methods, results and discussion, with patterns corresponding to deductive, inductive and abductive reasoning (
Our work to package and share CBK has several points of similarity to semantic publications. First, both initiatives recognise that knowledge and information need to be shared in formats that are accessible and actionable by both humans and machines in order to be discovered and used at scale. Semantic publications focus broadly on the extraction and dissemination of machine-usable elements from scientific papers, while our work extracts empirical evidence from clinical practice, studies, guidelines etc. and makes it computer-applicable. Second, both initiatives recognise the importance of providing information about context, provenance and relationships, in addition to the content itself.
The FAIR principles (
The Knowledge Systems Lab (https://knowledge-systems.lab.medicine.umich.edu/) is a health informatics research group based in the Department of Learning Health Sciences at the University of Michigan’s Medical School. Our foundational work is the development of the Knowledge Object (KO), a packaged artefact conveying various representations of modular and extensible computable biomedical knowledge with corresponding metadata. The KO model is content- and language-agnostic and modular. This modularity increases interoperability and makes it possible to implement KOs as single units of knowledge or combine them for more complex operations.
Since 2016, our lab has developed hundreds of KOs (
The following section describes our KO conceptual model, relevant ontologies and KGrid platform architecture. KGrid materials are freely available under a GPLv3 licence and include purpose-built collections of KOs (https://kgrid-objects.github.io/), demo projects (https://demo.kgrid.org/) and applications (https://kgrid.org/guides/download/), including the Activator, Library and command line interface.
We understand KOs as having a dual nature: KOs are both resources to be managed and services to be implemented. As a resource, KOs can be curated, stewarded and disseminated. As a service, KOs can be implemented in specific contexts and applied to case data automatically and, therefore, at scale. Both aspects of this nature are represented in our conceptual model (Fig.
Conceptual model of a Knowledge Object (KO) containing a payload, machine-actionable service and deployment specifications, metadata and a unique persistent identifier. We are exploring aligning our conceptual model with emerging best practices for FAIR Digital Objects. Derived from Wittenburg et al's Digital Objects as Drivers towards Convergence in Data Infrastructures (
The KO conceptual model includes:
These elements are packaged in a wrapper containing administrative, descriptive and technical metadata and the package is assigned a unique persistent identifier.
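The packaging described above can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the KGrid implementation: the class name, field names, the example identifier and the body mass index payload are all hypothetical, chosen only to show a payload, a service specification, a deployment specification, metadata and a persistent identifier travelling together.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class KnowledgeObject:
    """Hypothetical in-memory stand-in for a packaged Knowledge Object."""
    identifier: str                                       # unique persistent identifier
    metadata: Dict[str, str]                              # administrative, descriptive, technical
    payload: Callable[[Dict[str, Any]], Dict[str, Any]]   # the computable knowledge itself
    service_spec: Dict[str, Any]                          # how the payload is exposed as a service
    deployment_spec: Dict[str, Any]                       # how the payload is run


def bmi_payload(inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Toy payload: body mass index from weight (kg) and height (m)."""
    return {"bmi": inputs["weight_kg"] / inputs["height_m"] ** 2}


ko = KnowledgeObject(
    identifier="ark:/99999/fk4example",  # made-up identifier, for illustration only
    metadata={"title": "BMI calculator (example)", "version": "0.1"},
    payload=bmi_payload,
    service_spec={"endpoint": "/bmi", "inputs": ["weight_kg", "height_m"]},
    deployment_spec={"runtime": "python", "entry": "bmi_payload"},
)

# Knowledge-as-resource: the object can be described and catalogued.
print(ko.identifier, ko.metadata["title"])
# Knowledge-as-service: the same object can be applied to case data.
print(ko.payload({"weight_kg": 70, "height_m": 1.75}))
```

The two print statements correspond to the dual nature discussed in this section: the first treats the object as something to be managed, the second as something to be executed.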
The Figure below (Fig.
KOs containing computable biomedical knowledge are similar in many ways to the digital objects and semantic publishing models mentioned above. Similarities include a shared foundational understanding that the contents of a digital object should be machine-accessible, packaged with metadata and uniquely identified by a resolvable persistent resource identifier. However, there are also significant differences. The greatest difference stems from the tendency of the above-mentioned models to treat the object and its payload (whether data or knowledge) primarily as a resource, while Knowledge Objects containing CBK have essential properties of both a resource and a service. For example, where micropublications include both the representation and argumentation of statements and are meant to augment the scholarly publishing ecosystem, KOs are made to make knowledge machine-actionable, such that it can be implemented in existing infrastructure or stand-alone applications to perform specific tasks (Fig.
This figure illustrates the dual nature of Knowledge Objects: knowledge-as-resource and knowledge-as-service. A KO can be curated and maintained in a repository, pass metadata to a knowledge graph or be deployed into applications. Unlike other digital objects, a KO has built into it the methods needed to deploy it to applications via custom or generic runtimes called by microservices.
Representing KOs with an ontology facilitates machine interpretability. The Knowledge Object Reference Ontology (KORO) (https://bioportal.bioontology.org/ontologies/KORO) is a Basic Formal Ontology-based ontology, which extends the Information Artifact Ontology to formally specify a Knowledge Object. KORO's scope includes what is needed to build compound knowledge objects, implement them and make them FAIR. The ontology defines both the parts of a KO and the relationships between these parts utilising 110 classes and 19 properties, 78 and 4 of which, respectively, are unique to KORO (
KGrid was initially envisioned as an infrastructural platform that not only specified and packaged a knowledge object, but made it findable and accessible and facilitated its application or implementation. The original architecture consisted of two technical infrastructural components: a Library (enabling KO as resource) and an Activator to facilitate deployment of CBK payloads held in KOs (enabling KO as service). The first prototype library included components for standardised metadata and Archival Resource Key (ARK) ID assignment and registration and a gateway to allow for resource discovery and access through APIs (
The first Activator was built in Java using the Spring Framework. It enables the provision of Knowledge-as-a-Service, allowing KOs to be requested and retrieved via an application programming interface (API) call, serialised in JavaScript Object Notation and deployed via webservices (
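The request-and-response pattern of activation can be illustrated with a small in-process sketch. It is written in Python rather than Java, makes no network calls and does not reproduce the KGrid Activator's actual interface: the `Activator` class, its `load` and `handle` methods and the example identifier are assumptions introduced purely to show a payload being looked up by identifier and invoked with JSON input.

```python
import json
from typing import Any, Callable, Dict

Payload = Callable[[Dict[str, Any]], Dict[str, Any]]


class Activator:
    """Hypothetical, in-process stand-in for a service that activates KO payloads."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, Payload] = {}

    def load(self, ko_id: str, payload: Payload) -> None:
        """Register a payload under the identifier of the KO that carries it."""
        self._endpoints[ko_id] = payload

    def handle(self, ko_id: str, request_body: str) -> str:
        """Apply the payload to JSON-encoded inputs and return a JSON-encoded result."""
        if ko_id not in self._endpoints:
            return json.dumps({"error": f"unknown knowledge object: {ko_id}"})
        inputs = json.loads(request_body)
        result = self._endpoints[ko_id](inputs)
        return json.dumps({"ko": ko_id, "result": result})


activator = Activator()
# Made-up identifier and toy payload, for illustration only.
activator.load("ark:/99999/fk4demo", lambda inputs: {"double": inputs["x"] * 2})

response = activator.handle("ark:/99999/fk4demo", '{"x": 2}')
print(response)
```

In a deployed setting the `handle` call would sit behind a web endpoint; keeping the dispatch logic separate from the transport makes it easier to support more than one runtime.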
Early projects demonstrated the ability of KOs containing CBK to activate knowledge as a service at scale. One project addressed medication safety, combining KOs to alert physicians to atypical electronic prescriptions in an effort to minimise prescribing errors (
In addition to demonstration projects, KGrid’s conceptual model and technology have supported the translation of knowledge into practice. KGrid APIs were used to manage and deliver scoring model calculations as part of an app that recommends precision chemotherapy treatments for paediatric neuro-oncology patients (
We continue to expand our understanding of the dual nature of KOs representing knowledge-as-resource and knowledge-as-service. The following section describes current work in four emerging areas of research:
We have recently completed a retrospective analysis of the technology and projects developed during the first five years of the Knowledge Grid platform (2016 - 2021) and lessons learned. One important finding is that, rather than enabling knowledge-as-service, reliance on the original Activator may complicate KO implementation in certain situations. As a result, we have identified a need for technical work to update the original Activator model, as well as development of a specification and reference implementation for activation using different runtimes. The goal of this work is to provide end-users and developers with multiple ways to access and activate KOs, rather than restricting them to any single implementation or workflow. This work is intended to simplify KO integration into clinical workflows. The development of the specification and reference implementation will simplify the method of accessing and deploying KOs through an API call, while providing a software development kit will allow developers to integrate knowledge objects into existing workflows and runtime environments.
By engineering and packaging a variety of CBK as KOs, in accordance with the FAIR principles, we are developing a typology of CBK, understanding the unique characteristics of knowledge developed for different purposes, for example, risk prediction, clinical decision support, patient classification etc. This typology, together with the work described above to expand models of activation, will require updates to KORO and KOIO, so that KOs can be adequately described in a standardised way consistent with interoperability.
One example of current work includes defining a class of KO that facilitates patient cohort identification for clinical studies (
We are also interested in the modular nature of KOs, which holds exciting potential for combining different types of CBK for specific purposes. One recent example utilises KOs to prioritise preventative interventions for primary care providers and population health researchers. In the Composite Model for Individualized Precision Prevention (CM-IPP) project, 42 KO submodels were created, in a nested hierarchy, with each submodel representing one preventative medical service recommended by the United States Preventive Services Task Force (USPSTF). At the top level, an executive submodel utilises conditional logic to determine which other submodels should be engaged and these submodels rank preventative service recommendations, based on the patient’s unique characteristics (
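The composition pattern described above, in which an executive submodel decides which other submodels to engage and the results are ranked, can be sketched as follows. This is not the CM-IPP code: the service names, eligibility rules and benefit scores are invented placeholders with no clinical meaning and the `Submodel` type and `executive` function are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, NamedTuple, Tuple

Patient = Dict[str, Any]


class Submodel(NamedTuple):
    """Hypothetical submodel: one preventative service with an eligibility rule and a score."""
    service: str
    applies: Callable[[Patient], bool]
    benefit: Callable[[Patient], float]


# Invented placeholder submodels; the rules and numbers are NOT clinical guidance.
SUBMODELS: List[Submodel] = [
    Submodel("service_a", lambda p: p["age"] >= 50, lambda p: (p["age"] - 50) / 100),
    Submodel("service_b", lambda p: p["smoker"], lambda p: 0.5),
    Submodel("service_c", lambda p: p["age"] < 40, lambda p: 0.2),
]


def executive(patient: Patient, submodels: List[Submodel]) -> List[Tuple[str, float]]:
    """Engage only the applicable submodels, then rank services by estimated benefit."""
    engaged = [m for m in submodels if m.applies(patient)]
    scored = [(m.service, m.benefit(patient)) for m in engaged]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = executive({"age": 60, "smoker": True}, SUBMODELS)
print(ranking)
```

Because each submodel is self-contained, one can be added, replaced or retired without changing the executive logic, which is the practical benefit of the modularity discussed here.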
Finally, metadata is a primary focus of our research exploring the knowledge-as-resource nature of KOs. Metadata is an essential component of making computable biomedical knowledge FAIR and our work with specific types of CBK includes developing sufficient metadata, such that the artefacts can be discovered, accessed and implemented. This work relates to ongoing efforts within the Mobilizing Computable Biomedical Knowledge (MCBK) community to describe thirteen categories of metadata for computable knowledge
We are also exploring the use of semantic technologies, including linked data to extend our metadata model (
This paper presents our work to develop and demonstrate a conceptual model for packaging computable biomedical knowledge aligned with the FAIR principles, as well as initial efforts to create infrastructural components that can be added to an architecture for CBK management and implementation. Within healthcare and health informatics, this work has immediate relevance to two communities: learning health systems and a growing international community dedicated to Mobilizing Computable Biomedical Knowledge (MCBK) (https://mobilizecbk.med.umich.edu/). The outcomes of this work resonate in clinical contexts, health professions education, healthcare quality improvement, biomedical and translational research and population care. Our model will also be of interest to researchers and practitioners interested in knowledge science, including those working with semantic technologies and other forms of digital knowledge objects.
Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor MI.