A Comparative Study of Competency Question Elicitation Methods from Ontology Requirements

📅 2025-07-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Systematic comparative evaluation of competency question (CQ) acquisition methods in ontology engineering remains lacking. This paper introduces the first multi-annotator CQ dataset generated from the same source requirements and empirically compares three CQ generation approaches, namely manual construction, ontology schema-based template instantiation, and large language model (LLM) based auto-generation, across five dimensions (acceptability, ambiguity, relevance, readability, and complexity) with multi-rater scoring and inter-annotator agreement analysis. Key contributions include: (1) the first cross-method, same-source, and controllable CQ evaluation framework; (2) empirical evidence that LLMs excel in efficiency and lexical diversity for initial CQ generation but exhibit strong model dependency and high ambiguity rates, and require substantial human refinement; and (3) evidence that template-based methods offer superior controllability and consistency. These findings provide evidence-based guidance for selecting CQ acquisition methods and integrating LLMs into ontology modeling practice.
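
For reference, the multi-rater scoring described above implies an inter-annotator agreement analysis. The summary does not name the agreement metric, so the minimal Python sketch below assumes categorical per-CQ judgements (e.g., accept / borderline / reject for acceptability) and computes Fleiss' kappa; the count matrix and category labels are illustrative assumptions, not data from the paper.

import numpy as np

def fleiss_kappa(counts):
    # counts[i, j] = number of annotators assigning CQ i to category j;
    # every row must sum to the same number of annotators.
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    # Observed agreement: mean proportion of agreeing annotator pairs per item.
    p_i = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    # Expected (chance) agreement from the marginal category proportions.
    p_j = counts.sum(axis=0) / (n_items * n_raters)
    p_e = np.sum(p_j ** 2)
    return (p_bar - p_e) / (1.0 - p_e)

# Hypothetical acceptability judgements for four CQs from three annotators
# (columns: accept, borderline, reject).
example = [
    [3, 0, 0],
    [2, 1, 0],
    [0, 2, 1],
    [1, 1, 1],
]
print(round(fleiss_kappa(example), 3))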

📝 Abstract
Competency Questions (CQs) are pivotal in knowledge engineering, guiding the design, validation, and testing of ontologies. A number of diverse formulation approaches have been proposed in the literature, ranging from completely manual to Large Language Model (LLM) driven ones. However, attempts to characterise the outputs of these approaches and to compare them systematically are scarce. This paper presents an empirical comparative evaluation of three distinct CQ formulation approaches: manual formulation by ontology engineers, instantiation of CQ patterns, and generation using state-of-the-art LLMs. We generate CQs with each approach from the same set of requirements for the cultural heritage domain and assess them across several dimensions: degree of acceptability, ambiguity, relevance, readability, and complexity. Our contribution is twofold: (i) the first multi-annotator dataset of CQs generated from the same source using different methods; and (ii) a systematic comparison of the characteristics of the CQs resulting from each approach. Our study shows that the different CQ generation approaches have distinct characteristics and that LLMs can be used to initially elicit CQs; however, the resulting CQs are sensitive to the model used to generate them and generally require a further refinement step before they can be used to model requirements.
Problem

Research questions and friction points this paper is trying to address.

Compare methods for generating Competency Questions (CQs)
Evaluate CQ quality across acceptability, ambiguity, relevance
Assess LLM-generated CQs for ontology requirements refinement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Compare manual, pattern-based, and LLM-based CQ generation
Evaluate CQs on acceptability, ambiguity, relevance
First multi-annotator dataset for CQ methods