AI Summary
Traditional information-seeking (IS) agents face dual bottlenecks in parallel reasoning: inefficient, redundant inference expansion, and severe context-length constraints that hinder integration of long-horizon reasoning trajectories. To address these, we propose ParallelMuse, a two-stage paradigm that first performs functionally partitioned partial rollouts guided by uncertainty estimation to prune low-potential paths early, then applies redundancy-aware lossless compression to compactly aggregate multi-path reasoning, overcoming context limitations. By jointly optimizing exploration breadth and depth, ParallelMuse achieves up to 62% performance gains across multiple open-source IS agents and benchmarks while reducing exploratory token consumption by 10-30%. The approach advances scalable parallel reasoning for long-context information seeking without sacrificing fidelity or coverage.
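The uncertainty-guided pruning idea above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the function names, the entropy-based uncertainty signal, and the path-scoring scheme are all assumptions standing in for whatever estimators ParallelMuse actually uses.

```python
import math

def token_entropy(probs):
    """Shannon entropy (nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_branch_points(step_probs, top_k=2):
    """Pick the most uncertain steps of a partial rollout as branch points.

    `step_probs` holds one next-token distribution per generation step.
    High-entropy steps are where the model was least committed, so
    re-branching there covers the most alternative continuations while
    reusing the shared prefix instead of rolling out from scratch.
    """
    scored = [(token_entropy(p), i) for i, p in enumerate(step_probs)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]

def prune_paths(paths, keep=2):
    """Keep only the highest-scoring candidate paths.

    Each path is a (score, trajectory) pair; `score` stands in for any
    potential estimate (e.g. mean log-probability of the continuation).
    """
    return sorted(paths, key=lambda p: p[0], reverse=True)[:keep]
```

A near-uniform distribution flags a branch point (e.g. step 1 in `[[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]`), and low-potential branches are dropped before they consume further exploratory tokens.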
Abstract
Parallel thinking expands exploration breadth, complementing the deep exploration of information-seeking (IS) agents to further enhance problem-solving capability. However, conventional parallel thinking faces two key challenges in this setting: inefficiency from repeatedly rolling out from scratch, and difficulty in integrating long-horizon reasoning trajectories during answer generation, as limited context capacity prevents full consideration of the reasoning process. To address these issues, we propose ParallelMuse, a two-stage paradigm designed for deep IS agents. The first stage, Functionality-Specified Partial Rollout, partitions generated sequences into functional regions and performs uncertainty-guided path reuse and branching to enhance exploration efficiency. The second stage, Compressed Reasoning Aggregation, exploits reasoning redundancy to losslessly compress information relevant to answer derivation and synthesize a coherent final answer. Experiments across multiple open-source agents and benchmarks demonstrate up to 62% performance improvement with a 10--30% reduction in exploratory token consumption.
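The second stage's redundancy exploitation can be illustrated with a minimal sketch: evidence that recurs across parallel trajectories is kept once, so every unique answer-relevant snippet survives (the lossless part) while the aggregated context shrinks to fit the model's window. The snippet-level representation and the whitespace/case normalization are assumptions for illustration only.

```python
def aggregate_reasoning(trajectories):
    """Merge multi-path reasoning into a compact context for answer synthesis.

    `trajectories` is a list of paths, each a list of evidence snippets
    extracted from one rollout. Snippets duplicated across paths are kept
    once; all unique content is preserved in first-seen order.
    """
    seen = set()
    compressed = []
    for path in trajectories:
        for snippet in path:
            key = " ".join(snippet.split()).lower()  # normalize whitespace and case
            if key not in seen:
                seen.add(key)
                compressed.append(snippet)
    return compressed
```

When several rollouts retrieve the same fact, the aggregated context carries it once, leaving room for the findings that differ between paths.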