Query-Specific GNN: A Comprehensive Graph Representation Learning Method for Retrieval Augmented Generation

📅 2025-10-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the degradation of Retrieval-Augmented Generation (RAG) performance in multi-hop question answering—caused by high semantic complexity and strong retrieval noise—this paper proposes a knowledge graph–based, query-guided graph representation learning framework. The method constructs a multi-level knowledge graph and designs a query-aware message-passing mechanism to enable hierarchical, cross-hop information aggregation within a graph neural network. Additionally, it employs a synthetic data pretraining strategy to explicitly model long-range dependencies and suppress irrelevant noise. Experimental results demonstrate that the proposed framework significantly outperforms existing RAG approaches on multi-hop QA benchmarks, achieving a 33.8% performance gain on high-hop (≥4-hop) questions. These improvements validate the framework’s effectiveness in enhancing complex semantic understanding and robust retrieval augmentation.
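The core idea of query-guided aggregation — weighting each neighbor's message by its relevance to the query so that unrelated nodes contribute little — can be illustrated with a minimal sketch. This is not the paper's implementation; the function name, vectors, and dot-product relevance score are illustrative assumptions (the actual QSGNN operates on a multi-level knowledge graph with learned parameters):

```python
import math

def query_guided_aggregate(query, neighbors):
    """Aggregate neighbor features weighted by relevance to the query.

    Relevance is the dot product between the query vector and each
    neighbor vector, normalized with a softmax, so neighbors unrelated
    to the query receive low weight (noise suppression).
    """
    scores = [sum(q * x for q, x in zip(query, nb)) for nb in neighbors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]        # stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(query)
    return [sum(w * nb[i] for w, nb in zip(weights, neighbors))
            for i in range(dim)]

query = [1.0, 0.0]
neighbors = [[1.0, 0.0],   # aligned with the query (relevant)
             [0.0, 1.0]]   # orthogonal to the query ("noise")
agg = query_guided_aggregate(query, neighbors)
# the relevant neighbor dominates the aggregated message
```

With uniform (query-agnostic) averaging both neighbors would contribute equally; here the query-aligned neighbor receives roughly 73% of the weight, which is the noise-reduction effect the summary describes.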

📝 Abstract
Retrieval-augmented generation (RAG) has demonstrated its ability to enhance Large Language Models (LLMs) by integrating external knowledge sources. However, multi-hop questions, which require the identification of multiple knowledge targets to form a synthesized answer, raise new challenges for RAG systems. Under multi-hop settings, existing methods often struggle to fully understand questions with complex semantic structures and are susceptible to irrelevant noise during the retrieval of multiple information targets. To address these limitations, we propose a novel graph representation learning framework for multi-hop question retrieval. We first introduce a Multi-information Level Knowledge Graph (Multi-L KG) to model various information levels for a more comprehensive understanding of multi-hop questions. Based on this, we design a Query-Specific Graph Neural Network (QSGNN) for representation learning on the Multi-L KG. QSGNN employs intra/inter-level message passing mechanisms, and in each message passing the information aggregation is guided by the query, which not only facilitates multi-granular information aggregation but also significantly reduces the impact of noise. To enhance its ability to learn robust representations, we further propose two synthesized data generation strategies for pre-training the QSGNN. Extensive experimental results demonstrate the effectiveness of our framework in multi-hop scenarios, especially on high-hop questions, where the improvement can reach 33.8%. The code is available at: https://github.com/Jerry2398/QSGNN.
Problem

Research questions and friction points this paper is trying to address.

Enhancing multi-hop question answering in retrieval-augmented generation systems
Addressing complex semantic understanding and noise in multi-information retrieval
Improving robustness through query-specific graph neural network representation learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-level knowledge graph models complex question semantics
Query-specific GNN reduces noise through guided message passing
Synthesized data generation strategies enable robust pre-training
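The intra/inter-level message passing over a multi-level graph can be sketched as alternating aggregation rounds over two edge sets. This is a simplified, hypothetical sketch (plain Python, mean aggregation, toy one-dimensional features); the real QSGNN uses learned, query-guided aggregation on the Multi-L KG:

```python
from collections import defaultdict

def message_pass(features, edges):
    """One round of mean-aggregation message passing over the given edges.

    Reads all messages from the current feature dict and writes updated
    features to a copy, so updates within a round are synchronous.
    """
    msgs = defaultdict(list)
    for src, dst in edges:
        msgs[dst].append(features[src])
    out = dict(features)
    for node, incoming in msgs.items():
        dim = len(features[node])
        mean = [sum(v[i] for v in incoming) / len(incoming) for i in range(dim)]
        out[node] = [(a + b) / 2 for a, b in zip(features[node], mean)]
    return out

# Toy two-level graph: entity-level nodes e1, e2 and a sentence-level node s1.
feats = {"e1": [1.0], "e2": [3.0], "s1": [0.0]}
intra = [("e1", "e2"), ("e2", "e1")]   # edges within the entity level
inter = [("e1", "s1"), ("e2", "s1")]   # cross-level edges up to the sentence level
feats = message_pass(feats, intra)     # intra-level round
feats = message_pass(feats, inter)     # inter-level round
```

Running the intra-level round first lets fine-grained (entity) information mix before the inter-level round lifts it to the coarser (sentence) level, mirroring the hierarchical, cross-hop aggregation the framework describes.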