🤖 AI Summary
This work addresses the resource-intensive bottleneck in ontology engineering caused by poor communication between domain experts and ontology engineers during competency question (CQ) elicitation. The authors propose a semi-automated workflow that integrates large language models (LLMs) into an expert-in-the-loop process: initial CQs are automatically generated from requirement documents, then iteratively refined by the LLM based on expert feedback provided through an interactive platform until consensus is reached. This approach introduces, for the first time, an iterative CQ generation mechanism combining LLM capabilities with expert collaboration, complemented by a provenance model that fully traces the CQ lifecycle to ensure transparency and reproducibility. Evaluations in real-world scientific data and cultural heritage scenarios demonstrate that the method significantly accelerates requirements engineering and enhances CQ acceptability, relevance, and usability for domain experts.
📝 Abstract
Competency question (CQ) elicitation represents a critical but resource-intensive bottleneck in ontology engineering. This foundational phase is often hampered by the communication gap between domain experts, who possess the necessary knowledge, and ontology engineers, who formalise it. This paper introduces IDEA2, a novel, semi-automated workflow that integrates Large Language Models (LLMs) within a collaborative, expert-in-the-loop process to address this challenge. The methodology is characterised by a core iterative loop: an initial LLM-based extraction of CQs from requirement documents, a co-creational review and feedback phase by domain experts on an accessible collaborative platform, and an iterative, feedback-driven reformulation of rejected CQs by an LLM until consensus is achieved. To ensure transparency and reproducibility, the entire lifecycle of each CQ is tracked using a provenance model that captures the full lineage of edits, anonymised feedback, and generation parameters. The workflow was validated in two real-world scenarios (scientific data, cultural heritage), demonstrating that IDEA2 accelerates the requirements engineering process, improves the acceptance and relevance of the resulting CQs, and achieves high usability and effectiveness among domain experts. We release all code and experiments at https://github.com/KE-UniLiv/IDEA2.
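The core loop the abstract describes (generate CQs, collect expert review, reformulate rejected CQs until consensus, while logging provenance) can be sketched as follows. This is a minimal illustration only: the function names, the acceptance rule, and the provenance tuples are hypothetical stand-ins, not the paper's actual implementation, and the LLM calls are replaced by trivial placeholders.

```python
# Hypothetical sketch of an expert-in-the-loop CQ refinement loop.
# generate_cqs / reformulate stand in for LLM calls; review stands in
# for expert feedback collected on a collaborative platform.

def generate_cqs(document_terms):
    # Placeholder for the initial LLM extraction from a requirement document.
    return [f"What is {term}?" for term in document_terms]

def review(cq):
    # Placeholder expert review: reject overly short questions.
    accepted = len(cq.split()) >= 6
    feedback = None if accepted else "too vague; add context"
    return accepted, feedback

def reformulate(cq, feedback):
    # Placeholder for an LLM rewrite; a real system would condition the
    # prompt on the expert feedback rather than append a fixed phrase.
    return f"{cq[:-1]} in the context of this dataset?"

def refine_until_consensus(document_terms, max_rounds=3):
    """Iterate generation -> expert review -> reformulation, keeping a
    provenance log of every CQ version and the feedback that shaped it."""
    cqs = generate_cqs(document_terms)
    provenance = [("generated", cq, None) for cq in cqs]
    for _ in range(max_rounds):
        revised, all_accepted = [], True
        for cq in cqs:
            accepted, feedback = review(cq)
            if accepted:
                revised.append(cq)
            else:
                all_accepted = False
                new_cq = reformulate(cq, feedback)
                provenance.append(("reformulated", new_cq, feedback))
                revised.append(new_cq)
        cqs = revised
        if all_accepted:
            break  # consensus: every CQ accepted by the experts
    return cqs, provenance
```

The provenance list mirrors, in miniature, the abstract's requirement that each CQ's full lineage of edits and feedback remain traceable.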