Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering

📅 2025-06-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the weak generalization and limited structural support of graph-based retrievers in Knowledge Graph Question Answering (KGQA), this paper proposes RAPL, which has three components: (1) a two-stage annotation mechanism that combines heuristic signals with parametric models to provide causally grounded supervision for query-to-subgraph retrieval; (2) a model-agnostic graph transformation that jointly encodes intra- and inter-triple interactions; and (3) a path-based paradigm that decouples retrieval from reasoning and passes structured paths to the downstream reasoner. Evaluated on multiple KGQA benchmarks, RAPL outperforms state-of-the-art methods by 2.66%–20.34% and narrows the performance gap both between smaller and larger LLM-based reasoners and under cross-dataset settings. It also delivers more interpretable structured inputs to downstream reasoning modules.

📝 Abstract

Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains, but their reliability is hindered by outdated knowledge and hallucinations. Retrieval-Augmented Generation (RAG) mitigates these issues by grounding LLMs in external knowledge; however, most existing RAG pipelines rely on unstructured text, which limits interpretability and structured reasoning. Knowledge graphs, which represent facts as relational triples, offer a more structured and compact alternative. Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering (KGQA), with a significant proportion adopting the retrieve-then-reason paradigm. In this framework, graph-based retrievers have demonstrated strong empirical performance, yet they still generalize poorly. In this work, we propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA. RAPL addresses these limitations in three ways: (1) a two-stage labeling strategy that combines heuristic signals with parametric models to provide causally grounded supervision; (2) a model-agnostic graph transformation approach that captures both intra- and inter-triple interactions, thereby enhancing representational capacity; and (3) a path-based reasoning strategy that facilitates learning from the injected rational knowledge and supports the downstream reasoner through structured inputs. Empirically, RAPL outperforms state-of-the-art methods by 2.66%–20.34% and significantly reduces the performance gap between smaller and more powerful LLM-based reasoners, as well as the gap under cross-dataset settings, highlighting its superior retrieval capability and generalizability. Code is available at: https://github.com/tianyao-aka/RAPL.
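
The abstract does not spell out how the graph transformation in (2) is built. As a rough illustration only, the sketch below assumes a line-graph-style construction: each triple becomes a node (so its head, relation, and tail can be encoded together, the intra-triple part), and a directed edge links triple i to triple j when the tail of i is the head of j (the inter-triple part). The function name `triples_to_line_graph` and the toy triples are hypothetical and are not taken from the RAPL code base.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph: (head, relation, tail) triples.
triples = [
    ("Barack Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Hawaii", "part_of", "United States"),
    ("Barack Obama", "spouse", "Michelle Obama"),
]


def triples_to_line_graph(triples):
    """Return directed edges (i, j) between triple indices.

    Assumption for this sketch: triple i connects to triple j
    when the tail entity of i equals the head entity of j.
    """
    # Index triples by their head entity for fast lookup.
    by_head = defaultdict(list)
    for j, (head, _, _) in enumerate(triples):
        by_head[head].append(j)

    edges = []
    for i, (_, _, tail) in enumerate(triples):
        for j in by_head.get(tail, []):
            if i != j:  # skip self-loops such as (a, r, a)
                edges.append((i, j))
    return edges


edges = triples_to_line_graph(triples)
for i, j in edges:
    print(triples[i], "->", triples[j])
```

In a setup like this, any message-passing encoder could run over the transformed graph, which is one reading of "model-agnostic"; the actual construction used by RAPL may differ.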

Problem

Research questions and friction points this paper is trying to address.

Enhancing generalization in graph retrievers for KGQA

Improving structured reasoning with knowledge graphs

Reducing reliability issues in LLMs via retrieval

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage labeling combines heuristic and parametric models

Model-agnostic graph transformation captures triple interactions

Path-based reasoning enhances structured knowledge injection
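
To make the idea of path-structured inputs concrete, here is a minimal sketch that enumerates relation paths starting from a question entity and serializes them as text a downstream reasoner could consume. The hop limit, the arrow format, and the names `enumerate_paths` and `serialize_path` are assumptions made for illustration; the summary does not state how RAPL builds or formats its paths.

```python
from collections import defaultdict

# Hypothetical retrieved triples: (head, relation, tail).
triples = [
    ("Barack Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Hawaii", "part_of", "United States"),
    ("Barack Obama", "spouse", "Michelle Obama"),
]


def enumerate_paths(triples, start, max_hops=2):
    """Depth-first enumeration of simple paths of 1..max_hops triples from `start`."""
    outgoing = defaultdict(list)
    for head, rel, tail in triples:
        outgoing[head].append((rel, tail))

    paths = []

    def walk(node, path, visited):
        if path:
            paths.append(list(path))  # record a copy of the current path
        if len(path) == max_hops:
            return
        for rel, tail in outgoing.get(node, []):
            if tail in visited:  # avoid revisiting entities (no cycles)
                continue
            path.append((node, rel, tail))
            walk(tail, path, visited | {tail})
            path.pop()

    walk(start, [], {start})
    return paths


def serialize_path(path):
    """Render a path as 'e0 -> r1 -> e1 -> r2 -> e2'."""
    return path[0][0] + "".join(f" -> {rel} -> {tail}" for _, rel, tail in path)


serialized = [serialize_path(p) for p in enumerate_paths(triples, "Barack Obama")]
for line in serialized:
    print(line)
```

A learned retriever would presumably score and select paths rather than enumerate all of them; the exhaustive walk here only shows what the structured output might look like.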

🔎 Similar Papers

Dual Reasoning: A GNN-LLM Collaborative Framework for Knowledge Graph Question Answering