🤖 AI Summary
To lower the high barrier to FAIR knowledge graph (KG) construction caused by domain experts’ limited experience with semantic modeling, this paper proposes the Rosetta Statement approach: a lightweight metamodel that uses natural-language statements as its atomic modeling units and thereby decouples semantic modeling from ontology engineering. It supports versioning of statements, a detailed editing history, and dynamic labels that render each statement as a natural-language sentence. It is implemented in the Open Research Knowledge Graph (ORKG) as a use case, where domain experts can define data schema patterns without semantic modeling expertise. Planned extensions include combining Rosetta Statements with semantic units to organize ORKG into meaningful subgraphs, a search interface that requires no SPARQL or Cypher knowledge, and LLM-based tools for data entry and display. The metamodel supports a three-step KG construction procedure: domain experts first model content on their own using Wikidata terms, these terms are then mapped to established ontologies, and finally semantic graph patterns for reasoning are developed together with ontology engineers. The approach lowers entry barriers and increases cognitive interoperability.
📝 Abstract
Machines need data and metadata to be machine-actionable and FAIR (findable, accessible, interoperable, reusable) to manage increasing data volumes. Knowledge graphs and ontologies are key to this, but their use is hampered by high access barriers due to the prior knowledge of semantics and data modeling they require. The Rosetta Statement approach proposes modeling English natural language statements instead of a mind-independent reality. We propose a metamodel for creating semantic schema patterns for simple statement types. The approach supports versioning of statements and provides a detailed editing history. Each Rosetta Statement pattern has a dynamic label for displaying statements as natural language sentences. Implemented in the Open Research Knowledge Graph (ORKG) as a use case, this approach allows domain experts to define data schema patterns without needing semantic knowledge. Future plans include combining Rosetta Statements with semantic units to organize ORKG into meaningful subgraphs, improving usability. A search interface for querying statements without needing SPARQL or Cypher knowledge is also planned, along with tools for data entry and display using Large Language Models. The Rosetta Statement metamodel supports a three-step knowledge graph construction procedure. In the first step, domain experts model semantic content using Wikidata terms, without support from ontology engineers, which lowers entry barriers and increases cognitive interoperability. The second step maps the Wikidata terms to established ontologies, and the third step develops semantic graph patterns for reasoning, which requires collaboration with ontology engineers.
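To make the idea of a statement pattern with a dynamic label, a version, and an editing history more concrete, the following is a minimal illustrative sketch in Python. It is not the ORKG implementation or the paper's metamodel: the class names `StatementPattern` and `Statement`, their fields, and the example sentence are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from string import Formatter


@dataclass(frozen=True)
class StatementPattern:
    """Hypothetical schema pattern for one simple statement type."""
    name: str
    version: int         # patterns are versioned; a changed pattern gets a new number
    label_template: str  # dynamic label with named slots, e.g. "{subject} ..."

    def slots(self) -> list:
        # Collect the slot names that appear in the label template, in order.
        return [f for _, f, _, _ in Formatter().parse(self.label_template) if f]


@dataclass
class Statement:
    """Hypothetical statement instance that follows a pattern."""
    pattern: StatementPattern
    values: dict
    history: list = field(default_factory=list)  # earlier value sets, oldest first

    def __post_init__(self):
        missing = set(self.pattern.slots()) - set(self.values)
        if missing:
            raise ValueError(f"missing slot values: {sorted(missing)}")

    def render(self) -> str:
        # Display the statement as a natural-language sentence.
        return self.pattern.label_template.format(**self.values)

    def edit(self, **changes) -> None:
        # Keep the previous values so that every edit stays traceable.
        self.history.append(dict(self.values))
        self.values = {**self.values, **changes}


# Assumed example pattern and statement (illustrative only).
pattern = StatementPattern(
    name="weight measurement",
    version=1,
    label_template="{subject} has a weight of {value} {unit}",
)
statement = Statement(pattern, {"subject": "This apple", "value": "212", "unit": "grams"})
first_label = statement.render()
statement.edit(value="215")
second_label = statement.render()
```

In this sketch the sentence shown to the user is always derived from the pattern's label template, so a domain expert only fills in slots and never edits graph structure directly. In the first construction step the slot values would be Wikidata terms; plain strings stand in for them here.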