Ontology-based knowledge graph infrastructure for interoperable atomistic simulation data

📅 2026-03-31
🤖 AI Summary
This work addresses the limited reusability of atomistic simulation data, which stems from heterogeneous formats, incomplete metadata, and a lack of standardized representations of workflows and provenance. The authors propose an infrastructure that combines domain ontologies with a knowledge graph to normalize data from multiple sources into a common, ontology-aligned representation, to represent computational workflows in machine-readable form, and to allow partial reconstruction of computational procedures. The resulting knowledge graph contains over 750,000 triples describing nearly 8,000 computational samples. The framework supports consistent cross-dataset querying, integration of grain boundary data, cross-dataset analysis of material properties, and extraction of derived thermodynamic quantities, improving the findability, interoperability, and reuse of atomistic simulation data.
📝 Abstract
The reuse of atomistic simulation data is often limited by heterogeneous formats, incomplete metadata, and a lack of standardized representations of workflows and provenance. Here we present an ontology-based infrastructure for representing and integrating atomistic simulation data as a knowledge graph. The approach combines domain ontologies with a software framework that enables data capture both from existing datasets and directly from simulation workflows at the point of generation. Heterogeneous data from multiple sources are normalized into a common, ontology-aligned representation, enabling consistent querying and analysis across datasets. We demonstrate these capabilities through the integration of grain boundary data, cross-dataset analysis of material properties, and extraction of derived thermodynamic quantities from existing simulations. In addition, workflows are represented in a machine-readable form, enabling both forward provenance tracking and partial reconstruction of computational procedures. The resulting knowledge graph contains over 750,000 triples describing nearly 8,000 computational samples. This work provides a practical framework for improving the findability, interoperability, and reuse of atomistic simulation data.
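The normalization and cross-dataset querying described in the abstract can be pictured with a small sketch. It is not the authors' software: the two source records, their field names, the `ex:` predicates, and the helper functions are all hypothetical, chosen only to show how records with different field names and units might be mapped to a shared triple vocabulary and then queried uniformly.

```python
from collections import defaultdict

# Two invented records from hypothetical sources with different
# field names and energy units (eV vs. meV).
source_a = {"id": "a-001", "element": "Fe", "energy_eV": -8.24, "n_atoms": 2}
source_b = {"sample": "b-17", "species": "Fe", "E_total_meV": -16480.0, "atoms": 4}


def triples_from_a(rec):
    """Map a source-A record to (subject, predicate, object) triples."""
    s = f"ex:sample/{rec['id']}"
    return [
        (s, "ex:hasElement", rec["element"]),
        (s, "ex:hasTotalEnergyInEV", rec["energy_eV"]),
        (s, "ex:hasNumberOfAtoms", rec["n_atoms"]),
        (s, "ex:fromSource", "source_a"),
    ]


def triples_from_b(rec):
    """Map a source-B record to the same vocabulary, converting meV to eV."""
    s = f"ex:sample/{rec['sample']}"
    return [
        (s, "ex:hasElement", rec["species"]),
        (s, "ex:hasTotalEnergyInEV", rec["E_total_meV"] / 1000.0),
        (s, "ex:hasNumberOfAtoms", rec["atoms"]),
        (s, "ex:fromSource", "source_b"),
    ]


# The "graph" here is just a list of triples in the shared vocabulary.
graph = triples_from_a(source_a) + triples_from_b(source_b)


def energy_per_atom(graph, element):
    """Return energy per atom (eV) for every sample of the given element."""
    props = defaultdict(dict)
    for s, p, o in graph:
        props[s][p] = o
    return {
        s: d["ex:hasTotalEnergyInEV"] / d["ex:hasNumberOfAtoms"]
        for s, d in props.items()
        if d.get("ex:hasElement") == element
    }


result = energy_per_atom(graph, "Fe")
```

Once both records share one set of predicates, the same query runs over samples from either source without knowing their original format. A real deployment would use an RDF store and an ontology-defined vocabulary in place of the plain tuples and `ex:` names assumed here.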
Problem

Research questions and friction points this paper is trying to address.

atomistic simulation data
interoperability
metadata
provenance
data reuse
Innovation

Methods, ideas, or system contributions that make the work stand out.

ontology
knowledge graph
atomistic simulation
interoperability
provenance tracking
Abril Azocar Guzman
Institute for Advanced Simulations – Materials Data Science and Informatics (IAS-9), Forschungszentrum Jülich GmbH, Jülich, Germany
Sarath Menon
Interdisciplinary Centre for Advanced Materials Simulation (ICAMS), Ruhr University Bochum, Bochum, Germany
Tilmann Hickel
BAM Federal Institute for Materials Research and Testing
Materials Informatics
Stefan Sandfeld
Institute for Advanced Simulations – Materials Data Science and Informatics (IAS-9), Forschungszentrum Jülich GmbH, Jülich, Germany