Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 3
Influential: 1
🤖 AI Summary
This work addresses the limitations of existing agent-based search systems, which suffer from prolonged reasoning chains, sparse rewards, and credit assignment difficulties due to their monolithic architecture, ultimately undermining learning stability. To overcome these challenges, the authors propose M-ASK, a novel framework that decouples the search task into two specialized agents: a search behavior agent responsible for action execution and a knowledge management agent tasked with maintaining a compressed contextual representation. These agents are jointly optimized through turn-level fine-grained rewards. By integrating a multi-agent architecture, large language model tool invocation, and context compression techniques, M-ASK significantly outperforms strong baselines on multi-hop question answering benchmarks, achieving both higher answer accuracy and markedly improved training stability.

📝 Abstract
Agentic search has emerged as a promising paradigm for complex information seeking by enabling Large Language Models (LLMs) to interleave reasoning with tool use. However, prevailing systems rely on monolithic agents that suffer from structural bottlenecks, including unconstrained reasoning outputs that inflate trajectories, sparse outcome-level rewards that complicate credit assignment, and stochastic search noise that destabilizes learning. To address these challenges, we propose M-ASK (Multi-Agent Search and Knowledge), a framework that explicitly decouples agentic search into two complementary roles: Search Behavior Agents, which plan and execute search actions, and Knowledge Management Agents, which aggregate, filter, and maintain a compact internal context. This decomposition allows each agent to focus on a well-defined subtask and reduces interference between search and context construction. Furthermore, to enable stable coordination, M-ASK employs turn-level rewards to provide granular supervision for both search decisions and knowledge updates. Experiments on multi-hop QA benchmarks demonstrate that M-ASK outperforms strong baselines, achieving not only superior answer accuracy but also significantly more stable training dynamics. The source code for M-ASK is available at https://github.com/chenyiqun/M-ASK.
Problem

Research questions and friction points this paper is trying to address.

agentic search
monolithic architecture
credit assignment
search noise
reasoning trajectories
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent framework
agentic search
knowledge optimization
turn-level rewards
modular reasoning