Towards LLM-generated explanations for Component-based Knowledge Graph Question Answering Systems

📅 2025-08-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited explainability of component-based knowledge graph question answering (KGQA) systems, this paper proposes an automated explanation-generation method that combines large language models (LLMs) with data-flow analysis. The approach parses the SPARQL queries (component inputs) and RDF triples (component outputs) flowing between components to construct coherent behavioral traces, then uses LLMs to generate natural-language explanations end to end, replacing conventional handcrafted templates. The evaluation shows that LLM-generated explanations achieve high quality and mostly outperform template-based baselines in users' ratings of accuracy, coherence, and comprehensibility, thereby aiding developer debugging and strengthening end-user trust. The work thus offers a data-driven, scalable way to explain component-level behavior in KGQA systems.

📝 Abstract
Over time, software systems have reached a level of complexity that makes it difficult for their developers and users to explain particular decisions those systems make. In this paper, we focus on the explainability of component-based systems for Question Answering (QA). These components often conduct processes driven by AI methods, whose behavior and decisions cannot be clearly explained or justified, such that even QA experts find it hard to interpret the executed process and its results. To address this challenge, we present an approach that uses the components' input and output data flows as a source for representing their behavior and provides explanations for the components, enabling users to comprehend what happened. In the QA framework used here, the data flows of the components are represented as SPARQL queries (inputs) and RDF triples (outputs). Hence, we also provide valuable insights into the verbalization of these data types. In our experiments, the approach generates explanations using either template-based settings (baseline) or Large Language Models (LLMs) with different configurations (automatic generation). Our evaluation shows that the explanations generated via LLMs achieve high quality and mostly outperform template-based approaches according to the users' ratings. Our approach thus enables us to automatically explain the behavior and decisions of QA components to humans, using RDF and SPARQL as the context for the explanations.
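The abstract describes a pipeline in which a component's behavior is represented by its input SPARQL query and output RDF triples, which are then handed to an LLM for verbalization. A minimal sketch of that idea is below; the function names, component name, and prompt wording are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of representing a QA component's behavior via its data flow
# (input SPARQL query, output RDF triples) and turning it into an LLM prompt.
# All names and the prompt text are assumptions for illustration.

def build_trace(component: str, sparql_input: str,
                rdf_output: list[tuple[str, str, str]]) -> str:
    """Serialize a component's input/output data flow as a behavioral trace."""
    triples = "\n".join(f"{s} {p} {o} ." for s, p, o in rdf_output)
    return (f"Component: {component}\n"
            f"Input (SPARQL):\n{sparql_input}\n"
            f"Output (RDF triples):\n{triples}")

def build_explanation_prompt(trace: str) -> str:
    """Assemble a prompt asking an LLM to explain the traced behavior."""
    return ("Explain in plain English what the following QA component did, "
            "based on its input query and output data:\n\n" + trace)

trace = build_trace(
    "NamedEntityDisambiguator",  # hypothetical component name
    'SELECT ?uri WHERE { ?uri rdfs:label "Albert Einstein"@en . }',
    [("dbr:Albert_Einstein", "rdfs:label", '"Albert Einstein"@en')],
)
prompt = build_explanation_prompt(trace)
```

The resulting prompt string would then be sent to an LLM of choice; the paper evaluates several such configurations against a template baseline.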
Problem

Research questions and friction points this paper is trying to address.

Explaining complex component-based QA system decisions
Generating explanations for AI-driven KGQA components
Verbalizing RDF and SPARQL data flows automatically
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-generated explanations for component behavior
Using SPARQL and RDF as context data
Automatic verbalization of data flows
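The template-based baseline mentioned above fills fixed sentence patterns with values from the data flow. A minimal sketch of such a baseline follows; the template text and names are assumptions for illustration, not the paper's actual templates.

```python
# Hedged sketch of a template-based baseline verbalization: a fixed pattern
# filled with values extracted from the component's data flow.

TEMPLATE = ("The component {name} received a SPARQL query asking for {target} "
            "and produced {n} RDF triple(s).")

def verbalize(name: str, target: str,
              triples: list[tuple[str, str, str]]) -> str:
    """Fill the fixed template with data-flow facts."""
    return TEMPLATE.format(name=name, target=target, n=len(triples))

sentence = verbalize(
    "QueryBuilder",  # hypothetical component name
    "the label of a resource",
    [("dbr:Albert_Einstein", "rdfs:label", '"Albert Einstein"@en')],
)
```

Such templates are cheap but rigid, which is the motivation for replacing them with LLM-generated free-text explanations.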