Reasoning by Exploration: A Unified Approach to Retrieval and Generation over Graphs

📅 2025-10-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) face two key challenges in reasoning over large-scale structured graphs: insufficient structural integration and poor cross-graph generalization. To address these, we propose the Reasoning by Exploration (RoE) framework, which reformulates graph reasoning as a dynamic path-exploration process in which an LLM iteratively selects nodes and edges, jointly performing retrieval and generation and thereby replacing the conventional two-stage RAG paradigm. RoE is first supervised on gold-standard reasoning paths during fine-tuning, then refines its exploration policy through reinforcement learning. Experiments across multiple benchmarks show that RoE significantly outperforms state-of-the-art methods, achieving substantial gains in reasoning accuracy while generalizing strongly to unseen graphs. This work establishes a unified paradigm for structured graph reasoning grounded in sequential, goal-directed exploration.

📝 Abstract
Reasoning over structured graphs remains a fundamental challenge for Large Language Models (LLMs), particularly when scaling to large graphs. Existing approaches typically follow the retrieval-augmented generation (RAG) paradigm: first retrieving subgraphs relevant to the query and then generating answers conditioned on the retrieved subgraphs. However, such two-phase pipelines often struggle to faithfully incorporate graph structure, since the generation process is ultimately constrained by the quality and completeness of the retrieved subgraph. Although many advanced retrievers have been proposed recently to mitigate this issue, they are usually tailored to the training graphs and generalize poorly to unseen graphs, which limits their practical applicability. In this work, we propose Reasoning by Exploration (RoE), a novel approach that unifies retrieval and generation by framing reasoning over graphs as a process of graph exploration. At each step, the LLM selects candidate nodes and edges to explore, gradually constructing reasoning paths and generating answers along the way. To enable effective exploration, RoE is trained in two stages: supervised fine-tuning (SFT) on gold reasoning paths, followed by reinforcement learning (RL) to enhance exploration effectiveness and generalization. Experiments on benchmark datasets demonstrate that RoE achieves substantial overall improvements over baselines, while also generalizing effectively to unseen graphs.
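The step-by-step exploration the abstract describes can be sketched as a simple graph walk. Everything here is an illustrative assumption: the graph representation, the `score_step` callable (standing in for the LLM policy's ranking of candidates), and the fixed step budget are placeholders, not the paper's actual interface.

```python
# Minimal sketch of a Reasoning-by-Exploration-style loop: at each step,
# a policy (here an arbitrary scoring callable, standing in for the LLM)
# picks the next (edge, node) pair, gradually building a reasoning path.
# The adjacency format and stopping rule are illustrative assumptions.

def explore(graph, start_node, question, score_step, max_steps=5):
    """Walk the graph one edge at a time under a candidate-scoring policy.

    graph: dict mapping node -> list of (edge_label, neighbor) pairs.
    score_step: callable(question, partial_path, candidate) -> float.
    Returns the interleaved path [node, edge, node, edge, node, ...].
    """
    path = [start_node]
    current = start_node
    for _ in range(max_steps):
        candidates = graph.get(current, [])
        if not candidates:
            break  # dead end: nothing left to explore
        # The policy ranks candidate (edge, node) pairs given the question
        # and the path built so far, then commits to the best one.
        edge, nxt = max(candidates, key=lambda c: score_step(question, path, c))
        path.extend([edge, nxt])
        current = nxt
    return path
```

A real system would replace `score_step` with an LLM call conditioned on the question and the partial path, and would also decide when to stop and emit an answer rather than exhausting a fixed budget.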
Problem

Research questions and friction points this paper is trying to address.

Addresses LLM reasoning challenges over large structured graphs
Unifies retrieval and generation through graph exploration process
Enhances generalization to unseen graphs via multi-stage training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies retrieval and generation via graph exploration
Trains with supervised fine-tuning and reinforcement learning
Enhances generalization to unseen graph structures
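The two-stage recipe above (supervised fine-tuning on gold paths, then reinforcement learning on exploration outcomes) can be illustrated with a toy tabular policy. This is only a sketch of the training signal flow: the dictionary "policy", the count-based imitation, and the additive reward update are assumptions standing in for gradient updates to LLM weights.

```python
# Toy illustration of the two-stage training recipe: imitate gold reasoning
# paths (SFT-style), then reinforce whole-path success (RL-style). A tabular
# preference dict stands in for the LLM; the update rules are illustrative.

from collections import defaultdict

def sft_counts(gold_paths):
    """Stage 1: accumulate (node -> (edge, next-node)) choices from gold paths.

    Each path is an interleaved list [node, edge, node, edge, node, ...].
    """
    policy = defaultdict(lambda: defaultdict(float))
    for path in gold_paths:
        for i in range(0, len(path) - 2, 2):
            policy[path[i]][(path[i + 1], path[i + 2])] += 1.0
    return policy

def rl_update(policy, rollout, reward, lr=0.5):
    """Stage 2: nudge every action taken in a rollout by a path-level reward."""
    for i in range(0, len(rollout) - 2, 2):
        policy[rollout[i]][(rollout[i + 1], rollout[i + 2])] += lr * reward
    return policy
```

The key structural point the sketch preserves is that the RL stage scores entire exploration paths, so credit flows to every node-and-edge choice along a successful trajectory rather than to a single retrieval step.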