Dep-Search: Learning Dependency-Aware Reasoning Traces with Persistent Memory

📅 2026-01-26
🤖 AI Summary
This work addresses three limitations of existing search frameworks: they do not explicitly model dependencies between subproblems, they do not reuse retrieved knowledge, and they do not optimize multi-step reasoning strategies. To this end, the authors propose Dep-Search, a framework that captures subproblem dependencies through structured reasoning, enables knowledge reuse via on-demand retrieval coupled with a persistent memory mechanism, and unifies the scheduling of question decomposition, retrieval, and memory management through dependency-aware control. The search policy is learnable and is optimized end to end with the GRPO reinforcement learning algorithm. Experiments on seven question-answering datasets show that Dep-Search significantly improves multi-hop reasoning performance and consistently outperforms strong baselines across models of varying scales.
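The summary does not spell out how GRPO scores a rollout, so the following is only a minimal sketch of the group-relative advantage that GRPO is generally known for: each sampled trajectory's reward is normalized against the mean and standard deviation of its own sampling group. The function name, the use of the population standard deviation, and the `eps` term are assumptions made for illustration; the paper's actual reward design and objective are not given in this text.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group.

    `rewards` holds one scalar reward per trajectory sampled for the
    same question. Hypothetical helper, not taken from the paper.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption
    # eps avoids division by zero when every rollout gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts for one question: two correct (1.0), two incorrect (0.0)
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Rollouts that beat their group's average receive a positive advantage and the others a negative one, so no separate value model is needed to provide a baseline.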

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks, particularly when augmented with search mechanisms that enable systematic exploration of external knowledge bases. The field has evolved from traditional retrieval-augmented generation (RAG) frameworks to more sophisticated search-based frameworks that orchestrate multi-step reasoning through explicit search strategies. However, existing search frameworks still rely heavily on implicit natural language reasoning to determine search strategies and how to leverage retrieved information across reasoning steps. This reliance on implicit reasoning creates fundamental challenges for managing dependencies between sub-questions, efficiently reusing previously retrieved knowledge, and learning optimal search strategies through reinforcement learning. To address these limitations, we propose Dep-Search, a dependency-aware search framework that advances beyond existing search frameworks by integrating structured reasoning, retrieval, and persistent memory through GRPO. Dep-Search introduces explicit control mechanisms that enable the model to decompose questions with dependency relationships, retrieve information when needed, access previously stored knowledge from memory, and summarize long reasoning contexts into reusable memory entries. Through extensive experiments on seven diverse question answering datasets, we demonstrate that Dep-Search significantly enhances LLMs' ability to tackle complex multi-hop reasoning tasks, achieving substantial improvements over strong baselines across different model scales.
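The control flow described in the abstract (decompose with dependencies, retrieve when needed, reuse stored knowledge) can be illustrated with a small sketch. This is not the authors' implementation: the function `solve_with_dependencies`, the template-based sub-question format, and the dictionary used as persistent memory are all hypothetical, and the learned policy and the summarization step are left out. It only shows sub-questions being resolved in dependency order while a memory avoids repeated retrieval.

```python
from graphlib import TopologicalSorter


def solve_with_dependencies(sub_questions, retrieve, memory=None):
    """Resolve sub-questions in dependency order, reusing a memory.

    sub_questions: dict mapping an id to (template, [dependency ids]);
        a template may reference earlier answers as {id}.
    retrieve: callable taking a question string and returning an answer.
    memory: dict acting as the persistent memory (hypothetical stand-in).
    """
    memory = {} if memory is None else memory
    # Map each sub-question to the set of sub-questions it depends on
    graph = {qid: set(deps) for qid, (_, deps) in sub_questions.items()}
    answers = {}
    # static_order() yields every dependency before its dependents
    for qid in TopologicalSorter(graph).static_order():
        template, deps = sub_questions[qid]
        question = template.format(**{d: answers[d] for d in deps})
        if question not in memory:  # retrieve only on a memory miss
            memory[question] = retrieve(question)
        answers[qid] = memory[question]
    return answers, memory
```

A caller would pass something like `{"q1": ("What is the capital of France?", []), "q2": ("Which river runs through {q1}?", ["q1"])}` together with a retriever function. Passing the returned `memory` into a later call lets repeated sub-questions skip retrieval entirely.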

Problem

Research questions and friction points this paper is trying to address.

dependency-aware reasoning
search-based reasoning
multi-hop reasoning
retrieval-augmented generation
persistent memory

Innovation

Methods, ideas, or system contributions that make the work stand out.

dependency-aware reasoning
persistent memory
structured retrieval
multi-hop QA
explicit control mechanism