Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach

📅 2026-04-24

🤖 AI Summary

Automatically generating high-quality formal ontologies from unstructured text remains challenging: existing large language model (LLM)-based approaches suffer from poor Ontology Design Pattern compliance, structural redundancy, and ineffective iterative repair. This work proposes a planning-first, artifact-driven multi-agent paradigm for ontology generation, decomposing the task into a collaborative workflow among four specialized roles: Domain Expert, Manager, Coder, and Quality Assurer. Generated ontologies are evaluated for architectural quality by a panel of heterogeneous LLM judges and for functional usability through competency-question-driven SPARQL evaluation, complemented by a retrieval-augmented generation (RAG) based assessment. Compared to a single-agent baseline, the proposed approach significantly improves structural quality and modestly enhances queryability, with gains driven primarily by front-loaded planning, while offering a more auditable generation process.

📝 Abstract

Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear which architectural design choices drive generation quality and why current approaches fail. We present a controlled experimental study using domain-specific insurance contracts to investigate these questions. We first establish a single-agent LLM baseline, identifying key failure modes such as poor Ontology Design Pattern compliance, structural redundancy, and ineffective iterative repair. We then introduce a multi-agent architecture that decomposes ontology construction into four artifact-driven roles: Domain Expert, Manager, Coder, and Quality Assurer. We evaluate performance along two dimensions: architectural quality (via a panel of heterogeneous LLM judges) and functional usability (via competency-question-driven SPARQL evaluation, complemented by a retrieval-augmented-generation-based assessment). Results show that the multi-agent approach significantly improves structural quality and modestly enhances queryability, with gains driven primarily by front-loaded planning. These findings highlight planning-first, artifact-driven generation as a promising and more auditable path toward scalable automated ontology engineering.
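
To make the artifact-driven decomposition concrete, the following is a minimal sketch of how four roles might hand artifacts to one another in sequence. It is an illustration under stated assumptions, not the paper's implementation: the artifact names (`domain_notes`, `plan`, `ontology`, `qa_report`), the `Role` type, and the stub functions standing in for LLM calls are all hypothetical, and the abstract does not specify the exact artifacts or control flow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical artifact store: each role reads earlier artifacts and writes one new one.
Artifacts = Dict[str, str]
Agent = Callable[[Artifacts], str]


@dataclass
class Role:
    name: str       # role label taken from the abstract, e.g. "Domain Expert"
    produces: str   # key of the artifact this role writes (assumed name)
    agent: Agent    # stand-in for an LLM call


def run_pipeline(roles: List[Role], source_text: str) -> Artifacts:
    """Run the roles in order; every intermediate artifact is kept for auditing."""
    artifacts: Artifacts = {"source_text": source_text}
    for role in roles:
        artifacts[role.produces] = role.agent(artifacts)
    return artifacts


# Stub agents (assumptions): a real system would prompt an LLM here.
def domain_expert(a: Artifacts) -> str:
    return f"concepts extracted from: {a['source_text']}"


def manager(a: Artifacts) -> str:
    return f"plan based on: {a['domain_notes']}"


def coder(a: Artifacts) -> str:
    return f"# ontology draft following: {a['plan']}"


def quality_assurer(a: Artifacts) -> str:
    # Toy check only: accept any draft that starts with a comment marker.
    return "review: " + ("ok" if a["ontology"].startswith("#") else "revise")


ROLES = [
    Role("Domain Expert", "domain_notes", domain_expert),
    Role("Manager", "plan", manager),
    Role("Coder", "ontology", coder),
    Role("Quality Assurer", "qa_report", quality_assurer),
]

result = run_pipeline(ROLES, "Policy covers water damage.")
for key, value in result.items():
    print(f"{key}: {value}")
```

Because each role writes a named artifact before the next role runs, the planning output exists before any ontology code is drafted, and every intermediate step can be inspected afterwards. This is one plausible reading of "planning-first" and "auditable" in the abstract.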

Problem

Research questions and friction points this paper is trying to address.

ontology generation

unstructured text

knowledge engineering

large language models

ontology design patterns

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent LLM

ontology generation

artifact-driven design