Graph-S3: Enhancing Agentic Textual Graph Retrieval with Synthetic Stepwise Supervision

📅 2025-10-01
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper addresses the tension between retrieving from large textual graphs and the context-length limitations of large language models (LLMs). The authors propose an agent-based subgraph retrieval method with two key contributions: (1) a stepwise reward mechanism grounded in offline-extracted gold subgraphs, which mitigates sparse feedback in reinforcement learning; and (2) a two-stage training framework coupled with an LLM-driven synthetic data pipeline, enabling precise multi-hop subgraph localization. Evaluated on three benchmark datasets against seven strong baselines, the method achieves average improvements of 8.1% in accuracy and 9.7% in F₁ score, with particularly pronounced gains on multi-hop tasks. The code will be publicly released.
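The stepwise reward idea can be illustrated with a minimal sketch. The function name and the exact reward formula below are assumptions for illustration; the paper only specifies that each retrieval step is scored against an offline-extracted gold subgraph rather than rewarding the final answer alone.

```python
# Hypothetical sketch of a stepwise reward grounded in an offline gold
# subgraph: each step is rewarded for newly retrieved gold edges and
# penalized for selections outside the gold subgraph. The 0.5 penalty
# weight is an illustrative assumption.

def stepwise_rewards(gold_edges, step_selections):
    """Score each retrieval step against the gold subgraph."""
    gold = set(gold_edges)
    covered = set()
    rewards = []
    for selected in step_selections:
        hits = (set(selected) & gold) - covered   # newly found gold edges
        misses = set(selected) - gold             # off-subgraph selections
        rewards.append(len(hits) - 0.5 * len(misses))
        covered |= hits
    return rewards

# Example: a 3-edge gold subgraph, explored by the agent in two steps.
gold = [("A", "B"), ("B", "C"), ("C", "D")]
steps = [[("A", "B"), ("A", "E")], [("B", "C"), ("C", "D")]]
print(stepwise_rewards(gold, steps))  # [0.5, 2.0]
```

Because every step receives a score, the training signal is dense, in contrast to answer-only rewards that arrive once per episode.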

📝 Abstract
A significant portion of real-world data is inherently represented as textual graphs, and integrating these graphs into large language models (LLMs) is promising to enable complex graph-based question answering. However, a key challenge in LLM-based textual graph QA systems lies in graph retrieval, i.e., how to retrieve relevant content from large graphs that is sufficiently informative while remaining compact for the LLM context. Existing retrievers suffer from poor performance since they either rely on shallow embedding similarity or employ interactive retrieving policies that demand excessive data labeling and training cost. To address these issues, we present Graph-$S^3$, an agentic textual graph reasoning framework that employs an LLM-based retriever trained with synthetic stepwise supervision. Instead of rewarding the agent based on the final answers, which may lead to sparse and unstable training signals, we propose to closely evaluate each step of the retriever based on offline-extracted golden subgraphs. Our main techniques include a data synthesis pipeline to extract the golden subgraphs for reward generation and a two-stage training scheme to learn the interactive graph exploration policy based on the synthesized rewards. Based on extensive experiments on three common datasets in comparison with seven strong baselines, our approach achieves an average improvement of 8.1% in accuracy and 9.7% in F$_1$ score. The advantage is even higher in more complicated multi-hop reasoning tasks. Our code will be open-sourced.
Problem

Research questions and friction points this paper is trying to address.

Improving graph retrieval for large language models
Reducing data labeling costs in graph QA systems
Enhancing multi-hop reasoning on textual graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses synthetic stepwise supervision for graph retrieval
Trains agent with offline-extracted golden subgraphs
Implements two-stage training for exploration policy
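The interactive exploration policy that these rewards supervise can be sketched as a simple expand-or-stop loop. The interface below is an assumption for illustration; in the paper the policy is an LLM-based retriever trained in two stages, not the toy greedy policy shown here.

```python
# Hypothetical sketch of an interactive graph-exploration loop: the policy
# repeatedly inspects candidate edges on the frontier and either expands
# the retrieved subgraph or stops. `graph` is an adjacency dict.

def explore(graph, start_nodes, policy, max_steps=5):
    """Expand a subgraph step by step until the policy decides to stop."""
    frontier = set(start_nodes)
    subgraph = set()
    for _ in range(max_steps):
        # Candidate edges leaving the current frontier.
        candidates = [(u, v) for u in frontier for v in graph.get(u, [])
                      if (u, v) not in subgraph]
        if not candidates:
            break
        action = policy(subgraph, candidates)  # chosen edges, or None to stop
        if action is None:
            break
        subgraph.update(action)
        frontier = {v for _, v in action}
    return subgraph

# Toy run with a greedy "take every candidate" policy.
graph = {"Q": ["A"], "A": ["B"], "B": []}
result = explore(graph, ["Q"], lambda sg, cands: cands or None)
print(sorted(result))  # [('A', 'B'), ('Q', 'A')]
```

Each call to `policy` corresponds to one step that the synthetic stepwise rewards can score, which is what makes per-step supervision possible.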
Ge Chang
Institute for AI Industry Research (AIR), Tsinghua University
Jinbo Su
Institute for AI Industry Research (AIR), Tsinghua University
Jiacheng Liu
Institute for AI Industry Research (AIR), Tsinghua University
Pengfei Yang
Institute of Software, Chinese Academy of Sciences
Probabilistic model checking · DNN verification
Yuhao Shang
Institute for AI Industry Research (AIR), Tsinghua University
Huiwen Zheng
GDS Holdings Limited
Hongli Ma
GDS Holdings Limited
Yan Liang
Northwestern Polytechnical University
Information fusion · State estimation · Target tracking
Yuanchun Li
Institute for AI Industry Research (AIR), Tsinghua University
Mobile computing · Artificial intelligence
Yunxin Liu
IEEE Fellow, Guoqiang Professor, Institute for AI Industry Research (AIR), Tsinghua University
Mobile Computing · Edge Computing · AIoT · System · Networking