Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need?

📅 2025-10-18
🤖 AI Summary
Existing knowledge graph (KG)-based retrieval-augmented generation (RAG) methods struggle to retrieve information that is both accurate and diverse when answering complex queries over text-rich KGs. This paper proposes GraphFlow, a framework that jointly optimizes a retrieval policy and a flow estimator with a transition-based flow matching objective. The flow estimator factorizes the reward of the final retrieval outcome across intermediate retrieval states, which guides the policy to retrieve candidates in proportion to their reward without requiring process-level supervision. On the STaRK benchmark, GraphFlow outperforms strong KG-RAG baselines, including GPT-4o, by 10% on average in hit rate and recall. It also generalizes well to unseen KGs.

📝 Abstract
Retrieval-Augmented Generation (RAG) based on knowledge graphs (KGs) enhances large language models (LLMs) by providing structured and interpretable external knowledge. However, existing KG-based RAG methods struggle to retrieve accurate and diverse information from text-rich KGs for complex real-world queries. Process Reward Models (PRMs) offer a way to align the retrieval process of KG-based RAG with query-specific knowledge requirements, but they heavily rely on process-level supervision signals that are expensive and hard to obtain on KGs. To address this challenge, we propose GraphFlow, a framework that efficiently retrieves accurate and diverse knowledge required for real-world queries from text-rich KGs. GraphFlow employs a transition-based flow matching objective to jointly optimize a retrieval policy and a flow estimator. The flow estimator factorizes the reward of the retrieval outcome into the intermediate retrieval states. Such reward factorization guides the retrieval policy to retrieve candidates from KGs in proportion to their reward. This allows GraphFlow to explore high-quality regions of KGs that yield diverse and relevant results. We evaluate GraphFlow on the STaRK benchmark, which includes real-world queries from multiple domains over text-rich KGs. GraphFlow outperforms strong KG-RAG baselines, including GPT-4o, by 10% on average in hit rate and recall. It also shows strong generalization to unseen KGs, demonstrating its effectiveness and robustness.
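
The idea of spreading an outcome reward over intermediate retrieval states can be illustrated with a generic GFlowNet-style transition (detailed balance) loss. This is only a minimal sketch under stated assumptions, not GraphFlow's actual objective: the function names are hypothetical, each state is assumed to have a single parent (so the backward probability is 1 and its log is 0), and the flow of the terminal state is pinned to the outcome reward.

```python
import math


def transition_loss(log_flow_s, log_pf, log_flow_next, log_pb=0.0):
    """Squared mismatch between flow leaving s and flow entering s'.

    log_pb defaults to 0.0, i.e. an assumed backward probability of 1.
    """
    residual = (log_flow_s + log_pf) - (log_flow_next + log_pb)
    return residual ** 2


def trajectory_loss(log_flows, log_pfs, terminal_reward):
    """Sum the transition losses along one retrieval trajectory.

    log_flows[t]: estimated log-flow of the non-terminal state s_t
    log_pfs[t]:   log-probability the policy assigns to s_t -> s_{t+1}
    The last transition is matched against log(terminal_reward), so the
    only supervision is the reward of the final retrieval outcome.
    """
    if len(log_flows) != len(log_pfs):
        raise ValueError("need one flow estimate per transition")
    next_flows = list(log_flows[1:]) + [math.log(terminal_reward)]
    return sum(
        transition_loss(f, p, n)
        for f, p, n in zip(log_flows, log_pfs, next_flows)
    )


half = math.log(0.5)

# Flows 8 -> 4 -> reward 2 with step probability 0.5: every step balances.
consistent = trajectory_loss([math.log(8.0), math.log(4.0)], [half, half], 2.0)

# Flows 8 -> 8 -> reward 2: both steps violate the balance condition.
inconsistent = trajectory_loss([math.log(8.0), math.log(8.0)], [half, half], 2.0)

print(consistent, inconsistent)
```

In this toy setup the loss is (numerically) zero only when the intermediate flow estimates are consistent with the terminal reward, which is how a terminal signal can shape intermediate states without step-level labels.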

Problem

Research questions and friction points this paper is trying to address.

Retrieving accurate and diverse information from text-rich knowledge graphs
Aligning KG-based RAG retrieval with query-specific knowledge requirements
Reducing reliance on expensive process-level supervision signals

Innovation

Methods, ideas, or system contributions that make the work stand out.

Transition-based flow matching objective jointly optimizes the retrieval policy and a flow estimator
Flow estimator factorizes the reward of the retrieval outcome into intermediate retrieval states
Framework retrieves diverse knowledge from text-rich graphs
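
To see why retrieving "in proportion to reward" can help diversity, the toy comparison below contrasts greedy selection with reward-proportional sampling. It is a hedged illustration only: the candidate names and reward values are made up, and the helper functions are hypothetical rather than part of GraphFlow.

```python
import random
from collections import Counter

# Made-up candidates and rewards, purely for illustration.
rewards = {"doc_a": 5.0, "doc_b": 3.0, "doc_c": 2.0}


def greedy_pick(rewards):
    """Always return the single highest-reward candidate."""
    return max(rewards, key=rewards.get)


def proportional_picks(rewards, k, seed=0):
    """Draw k candidates with probability proportional to reward."""
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    names = list(rewards)
    weights = [rewards[name] for name in names]
    return rng.choices(names, weights=weights, k=k)


counts = Counter(proportional_picks(rewards, k=1000))

print("greedy:", greedy_pick(rewards))
print("proportional:", dict(counts))
```

Greedy selection returns the same top candidate every time, while proportional sampling still favors high-reward candidates but keeps lower-reward ones in play.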