Retrieving Minimal and Sufficient Reasoning Subgraphs with Graph Foundation Models for Path-aware GraphRAG

📅 2026-03-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limitations of existing GraphRAG methods, which rely on heuristic rules in cold-start scenarios and struggle to produce reasoning subgraphs that are both structurally concise and informationally complete, resulting in poor generalization. To overcome this, we propose a cross-domain subgraph retrieval framework based on a pre-trained Graph Foundation Model (GFM) that directly generates multi-hop path-aware reasoning subgraphs in response to queries. An unsupervised information bottleneck mechanism is introduced to automatically distill a minimal yet sufficient “core evidence set.” By reorganizing relational paths into interpretable contextual prompts, our approach significantly enhances retrieval quality and answer generation performance on multi-hop question answering benchmarks while maintaining efficiency, marking the first successful reuse of graph foundation models for cross-domain subgraph retrieval.
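
For orientation, the trade-off behind a "minimal yet sufficient" evidence set can be written in the generic Information Bottleneck form. This is a sketch under assumed notation, not necessarily the paper's exact objective: $q$ (the query), $G_q$ (the retrieved graph), $G_s$ (the selected subgraph), and $\beta$ (a trade-off weight) are symbols introduced here for illustration.

```latex
% Generic IB-style trade-off (assumed notation, not taken from the paper):
%   I(G_s; q)   rewards sufficiency: the subgraph stays informative about the query
%   I(G_s; G_q) penalizes size: the subgraph keeps as little of the retrieved graph as possible
\[
  G_s^{*} \;=\; \arg\max_{G_s \subseteq G_q} \; I(G_s;\, q) \;-\; \beta \, I(G_s;\, G_q),
  \qquad \beta > 0 .
\]
```

A larger $\beta$ favors smaller subgraphs, while a smaller $\beta$ favors keeping more query-relevant evidence.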

📝 Abstract
Graph-based retrieval-augmented generation (GraphRAG) exploits structured knowledge to support knowledge-intensive reasoning. However, most existing methods treat graphs as intermediate artifacts, and the few subgraph-based retrieval methods depend on heuristic rules coupled with domain-specific distributions. They fail in typical cold-start scenarios where data in target domains is scarce, yielding reasoning contexts that are either informationally incomplete or structurally redundant. In this work, we revisit retrieval from a structural perspective and propose GFM-Retriever, which directly responds to user queries with a subgraph, where a pre-trained Graph Foundation Model (GFM) acts as a cross-domain retriever for multi-hop path-aware reasoning. Building on this perspective, we repurpose a pre-trained GFM from an entity ranking function into a generalized retriever to support cross-domain retrieval. On top of the retrieved graph, we further derive a label-free subgraph selector optimized by a principled Information Bottleneck objective to identify the query-conditioned subgraph, which contains informationally sufficient and structurally minimal golden evidence in a self-contained "core set". To connect structure with generation, we explicitly extract and reorganize relational paths as in-context prompts, enabling interpretable reasoning. Extensive experiments on multi-hop question answering benchmarks demonstrate that GFM-Retriever achieves state-of-the-art performance in both retrieval quality and answer generation, while maintaining efficiency.
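
The last step described above, turning relational paths in a retrieved subgraph into an in-context prompt, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `find_paths` and `paths_to_prompt`, the arrow-style path rendering, the `max_hops` limit, and the toy triples are all hypothetical choices made for illustration.

```python
from collections import deque
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def find_paths(triples: List[Triple], source: str, target: str,
               max_hops: int = 3) -> List[List[Triple]]:
    """Enumerate simple directed paths from source to target (hypothetical helper)."""
    adjacency: Dict[str, List[Tuple[str, str]]] = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))

    paths: List[List[Triple]] = []
    queue = deque([(source, [])])  # breadth-first: shorter paths come out first
    while queue:
        node, path = queue.popleft()
        if node == target and path:
            paths.append(path)
            continue
        if len(path) >= max_hops:
            continue
        # Entities already on this path; skipping them keeps paths cycle-free.
        visited = {source} | {tail for _, _, tail in path}
        for relation, tail in adjacency.get(node, []):
            if tail not in visited:
                queue.append((tail, path + [(node, relation, tail)]))
    return paths


def paths_to_prompt(question: str, paths: List[List[Triple]]) -> str:
    """Render each path as one readable line, followed by the question."""
    lines = ["Evidence paths:"]
    for index, path in enumerate(paths, start=1):
        rendered = path[0][0] + "".join(f" -[{rel}]-> {tail}" for _, rel, tail in path)
        lines.append(f"{index}. {rendered}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Toy subgraph, invented for illustration only.
toy_triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Marie Curie", "field", "Physics"),
]
toy_paths = find_paths(toy_triples, "Marie Curie", "Poland")
toy_prompt = paths_to_prompt("In which country was Marie Curie born?", toy_paths)
```

Keeping each path on a single line preserves hop order, so a reader (or a language model) can follow the chain of relations from the query entity to the candidate answer.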
Problem

Research questions and friction points this paper is trying to address.

GraphRAG
subgraph retrieval
cold-start
multi-hop reasoning
structured knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Foundation Model
Subgraph Retrieval
Information Bottleneck
Path-aware Reasoning
GraphRAG