Authors:
(1) Vogt, Lars, TIB Leibniz Information Centre for Science and Technology; (2) Konrad, Marcel, TIB Leibniz Information Centre for Science and Technology; (3) Prinz, Manuel, TIB Leibniz Information Centre for Science and Technology.
Above, we discussed the role of terms and statement structures (i.e., syntax trees and (meta)data schemata) in reliably communicating the meaning and thus the semantic content of (meta)data statements. Statement structures specify syntactic positions or slots with semantic roles or constraint specifications for a given statement type. To achieve semantic interoperability, we therefore need controlled vocabularies (i.e., ontologies) and ontological and referential term mappings across ontologies for FAIR terms and their terminological interoperability. And we need (meta)data schemata and ontological and referential schema crosswalks for FAIR (meta)data statements and their schematic interoperability.
We also discussed why we think that it is impossible to agree on a best term for every possible type of entity and a best schema for every possible type of statement, due to varying frames of reference and operational priorities. Therefore, we think that we need something like a machine-actionable Rosetta Stone to support the establishment of semantic interoperability across different terms and different schemata for a given type of (meta)data statement. This Rosetta Stone needs to function like an interlingua, with which term mappings and schema crosswalks can be easily specified and operationalized. The building blocks of the interlingua are reference terms, reference datatype specifications, and reference schemata. Each entity type must have a corresponding reference term, and each statement type must have a corresponding reference schema. Terms from controlled vocabularies can be mapped to their corresponding reference term, and schemata to their corresponding reference schema. Constraint specifications for slots of reference schemata must refer to reference terms in the case of resources, and to reference datatype specifications in the case of values. These three types of building blocks act as mediating connectors, so that it is no longer necessary to specify a schema crosswalk for every possible pair of schemata of a given type of (meta)data statement, or a term mapping for every possible pair of terms. Instead, each schema and each term only needs to be mapped once, to its reference counterpart. This minimizes the number of schema crosswalks and term mappings that need to be specified in order to achieve schematic and terminological interoperability for a given type of statement (Fig. 4).
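To make the saving concrete: with n schemata (or terms) for the same statement type, linking every pair directly takes n(n-1)/2 crosswalks, whereas routing everything through one reference needs only n. The following is a minimal sketch of this hub idea, not the authors' implementation. The vocabulary prefixes, the term identifiers, and the `translate` helper are hypothetical and chosen only for illustration.

```python
def pairwise_crosswalks(n: int) -> int:
    """Undirected crosswalks needed if every pair of n schemata is linked directly."""
    return n * (n - 1) // 2


def hub_crosswalks(n: int) -> int:
    """Crosswalks needed if each of n schemata is linked only to the reference schema."""
    return n


# Hypothetical term mappings: each vocabulary term points at one reference term.
TO_REFERENCE = {
    "vocabA:Weight": "ref:mass",
    "vocabB:BodyMass": "ref:mass",
    "vocabC:Masse": "ref:mass",
}


def translate(term: str, target_vocab: str):
    """Translate a term into another vocabulary by going through its reference term.

    Returns None when the target vocabulary has no term mapped to the same reference.
    """
    reference = TO_REFERENCE[term]
    for candidate, candidate_ref in TO_REFERENCE.items():
        if candidate_ref == reference and candidate.startswith(target_vocab + ":"):
            return candidate
    return None


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, pairwise_crosswalks(n), hub_crosswalks(n))
    print(translate("vocabA:Weight", "vocabB"))
```

No direct mapping between `vocabA` and `vocabB` is ever declared here. The translation exists only because both terms point at the same reference term, which is the mediating role the text assigns to the interlingua.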
Ideally, a reference schema is based on a generic Rosetta modeling paradigm that allows the reconstruction of the natural language statement underlying the datum. At the same time, it should document this statement using a formalized structure to ensure its human- and machine-actionability. With respect to human-actionability, the Rosetta modeling paradigm should reflect as closely as possible the structure of natural language statements, favoring lean over complex models, with the aim of reducing overall modeling complexity and modeling burden. Many schemata are very complex and include positions with resources that do not directly align with any input slot (e.g., ‘scalar measurement datum’ and ‘scalar value specification’ in Fig. 2E). Such schemata are not suitable for use as reference schemata.
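As a rough illustration of what such a lean reference schema might look like, the sketch below models a statement type as a flat list of slots, each constrained by either a reference term (for resources) or a reference datatype (for values), together with a template that reconstructs the natural language statement. The class names, the `ref:` identifiers, and the weight measurement example are assumptions made for this sketch and are not taken from the Rosetta specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    name: str
    kind: str        # "resource" or "value"
    constraint: str  # reference term (resources) or reference datatype (values)


@dataclass(frozen=True)
class ReferenceSchema:
    statement_type: str
    slots: tuple
    template: str  # natural language pattern with one placeholder per slot

    def render(self, **fillers) -> str:
        """Reconstruct the natural language statement from the slot fillers."""
        missing = [s.name for s in self.slots if s.name not in fillers]
        if missing:
            raise ValueError(f"unfilled slots: {missing}")
        return self.template.format(**fillers)


# Hypothetical reference schema: every slot corresponds directly to an input.
WEIGHT_MEASUREMENT = ReferenceSchema(
    statement_type="weight measurement",
    slots=(
        Slot("object", "resource", "ref:material_entity"),
        Slot("value", "value", "ref:decimal"),
        Slot("unit", "resource", "ref:mass_unit"),
    ),
    template="{object} has a weight of {value} {unit}",
)

sentence = WEIGHT_MEASUREMENT.render(object="This apple", value=212.45, unit="grams")
print(sentence)
```

Every slot in this sketch is filled directly from user input, so the schema contains no intermediate resources of the kind the text describes as unsuitable for reference schemata.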