Retrieval-Augmented Generation of Ontologies from Relational Databases

📅 2025-06-02
📈 Citations: 0
Influential: 0
📄 PDF

career value

176K/year
🤖 AI Summary
To address the challenges of ontology scarcity, high manual effort, and poor interoperability in transforming relational databases into high-semantic-quality knowledge graphs, this paper proposes RIGOR: a retrieval-augmented, iterative ontology generation framework. RIGOR integrates database schemas, domain ontology repositories, and a dynamically evolving core ontology, leveraging foreign-key-driven iterations and a dual-LLM coordination mechanism—where a generative LLM constructs OWL ontologies and a discriminative LLM ensures semantic consistency and formal verifiability. This enables traceable, incremental, and fully automated ontology engineering. Experiments on real-world databases demonstrate that RIGOR-generated ontologies significantly outperform baseline approaches in accuracy, completeness, and logical consistency, while reducing human intervention by over 70%. The resulting ontologies effectively support semantic interoperability and graph neural network–based reasoning.

Technology Category

Application Category

📝 Abstract
Transforming relational databases into knowledge graphs with enriched ontologies enhances semantic interoperability and unlocks advanced graph-based learning and reasoning over data. However, previous approaches either demand significant manual effort to derive an ontology from a database schema or produce only a basic ontology. We present RIGOR, Retrieval-augmented Iterative Generation of RDB Ontologies, an LLM-driven approach that turns relational schemas into rich OWL ontologies with minimal human effort. RIGOR combines three sources via RAG, the database schema and its documentation, a repository of domain ontologies, and a growing core ontology, to prompt a generative LLM for producing successive, provenance-tagged delta ontology fragments. Each fragment is refined by a judge-LLM before being merged into the core ontology, and the process iterates table-by-table following foreign key constraints until coverage is complete. Applied to real-world databases, our approach outputs ontologies that score highly on standard quality dimensions such as accuracy, completeness, conciseness, adaptability, clarity, and consistency, while substantially reducing manual effort.
Problem

Research questions and friction points this paper is trying to address.

Automating ontology generation from relational databases
Enhancing semantic interoperability with enriched ontologies
Reducing manual effort in creating comprehensive OWL ontologies
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-driven RDB to OWL ontology conversion
RAG combines schema, docs, domain ontologies
Iterative judge-LLM refinement of delta fragments
M
M. Nayyeri
Institute for Artificial Intelligence, University of Stuttgart, Stuttgart, Germany
A
Athish A Yogi
Institute for Artificial Intelligence, University of Stuttgart, Stuttgart, Germany
N
Nadeen Fathallah
Institute for Artificial Intelligence, University of Stuttgart, Stuttgart, Germany
Ratan Bahadur Thapa
Ratan Bahadur Thapa
Postdoc at the Institute for Artificial Intelligence, University of Stuttgart
Formal MethodsSW Technology
H
H. Tautenhahn
Universitätsklinikum Leipzig, Leipzig, Germany
A
Anton Schnurpel
Universitätsklinikum Leipzig, Leipzig, Germany
Steffen Staab
Steffen Staab
WAIS, University of Southampton, UK & Analytic Computing, Universität Stuttgart, DE
Artificial IntelligenceKnowledge GraphsSimulation ScienceIntelligent User InterfacesWeb Sci