AI Summary
This study addresses the semantic interoperability problems that arise when heterogeneous experimental data systems interpret the AnIML standard inconsistently. To resolve this, the work presents the first formalization of the AnIML standard as an OWL 2 ontology, aligned with the Allotrope Data Format. The ontology was built with a collaborative engineering methodology that combines domain expert input with large language model-assisted requirements elicitation. A novel validation protocol uses adversarial negative competency questions, mapped to established ontological anti-patterns and enforced via SHACL constraints, to test ontological rigor. The approach transforms real-world AnIML files into knowledge graphs and verifies competency questions via SPARQL, supporting ontological consistency and cross-system interoperability.
Abstract
Achieving semantic interoperability across heterogeneous experimental data systems remains a major barrier to data-driven scientific discovery. The Analytical Information Markup Language (AnIML), a flexible XML-based standard for analytical chemistry and biology, is increasingly used in industrial R&D labs for managing and exchanging experimental data. However, the expressivity of the XML schema permits divergent interpretations across stakeholders, introducing inconsistencies that undermine the interoperability the AnIML schema was designed to support. In this paper, we present the AnIML Ontology, an OWL 2 ontology that formalises the semantics of AnIML and aligns it with the Allotrope Data Format to support future cross-system and cross-lab interoperability. The ontology was developed using an expert-in-the-loop approach combining LLM-assisted requirement elicitation with collaborative ontology engineering. We validate the ontology through a multi-layered approach: data-driven transformation of real-world AnIML files into knowledge graphs, competency question verification via SPARQL, and a novel validation protocol based on adversarial negative competency questions mapped to established ontological anti-patterns and enforced via SHACL constraints.
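The validation layers described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses only the Python standard library as a stand-in for an RDF store, SPARQL, and SHACL, and it makes several assumptions. The XML fragment is a simplified, namespace-free approximation of AnIML, and the `ex:` prefix, the `ex:Sample`, `ex:ExperimentStep`, and `ex:usesSample` terms, and all function names are hypothetical, not taken from the AnIML Ontology. The sketch turns the XML into triples, answers a positive competency question ("which samples does a step use?"), and then runs a negative check that should return nothing for valid data ("does any step reference an undeclared sample?").

```python
import xml.etree.ElementTree as ET

# Simplified AnIML-like fragment (assumption: real files use an XML
# namespace and many more elements and attributes).
ANIML_XML = """
<AnIML>
  <SampleSet>
    <Sample sampleID="s1" name="Caffeine standard"/>
  </SampleSet>
  <ExperimentStepSet>
    <ExperimentStep experimentStepID="e1" name="UV scan">
      <Infrastructure>
        <SampleReferenceSet>
          <SampleReference sampleID="s1" role="consumed"/>
          <SampleReference sampleID="s9" role="consumed"/>
        </SampleReferenceSet>
      </Infrastructure>
    </ExperimentStep>
  </ExperimentStepSet>
</AnIML>
"""

EX = "ex:"  # hypothetical prefix, not the ontology's real IRI


def to_triples(xml_text):
    """Map the XML fragment to a set of (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    triples = set()
    for sample in root.iter("Sample"):
        node = EX + sample.get("sampleID")
        triples.add((node, "rdf:type", "ex:Sample"))
        triples.add((node, "ex:name", sample.get("name")))
    for step in root.iter("ExperimentStep"):
        node = EX + step.get("experimentStepID")
        triples.add((node, "rdf:type", "ex:ExperimentStep"))
        for ref in step.iter("SampleReference"):
            triples.add((node, "ex:usesSample", EX + ref.get("sampleID")))
    return triples


def samples_used_by(triples, step):
    """Positive competency question: which samples does this step use?"""
    return sorted(o for s, p, o in triples if s == step and p == "ex:usesSample")


def undeclared_sample_references(triples):
    """Negative check: (step, sample) pairs whose sample was never declared.

    An empty result means the constraint holds.
    """
    declared = {s for s, p, o in triples if p == "rdf:type" and o == "ex:Sample"}
    return sorted(
        (s, o) for s, p, o in triples if p == "ex:usesSample" and o not in declared
    )


triples = to_triples(ANIML_XML)
samples_used = samples_used_by(triples, "ex:e1")
violations = undeclared_sample_references(triples)
print(samples_used)
print(violations)
```

Here the fragment deliberately contains a dangling reference to `s9`, so the negative check reports one violation. In a setup like the one the abstract describes, the positive question would be a SPARQL query over the knowledge graph and the negative check would be enforced as a SHACL constraint.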