WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents

📅 2025-09-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets three obstacles to long-horizon knowledge exploration by AI agents: context window limitations, noise interference, and the disconnect between passive recall and active knowledge construction. It proposes an autonomous research agent with long-range reasoning capabilities. Methodologically, deep research is formalized as a Markov Decision Process in which the agent iterates through retrieval, reasoning, and summarization, folding its findings into a dynamically updated report while keeping a focused workspace. A parallel thinking architecture extends this to multi-agent collaborative reasoning, and training data is produced by tool-augmented, progressively harder data synthesis. The stated key contribution is the first systematic realization of an active knowledge construction paradigm. Experiments report state-of-the-art performance across six challenging benchmarks, with substantial improvements in tool use and in the quality of generated knowledge, outperforming leading proprietary systems.

📝 Abstract

Recent advances in deep-research systems have demonstrated the potential for AI agents to autonomously discover and synthesize knowledge from external sources. In this paper, we introduce WebResearcher, a novel framework for building such agents through two key components: (1) WebResearcher, an iterative deep-research paradigm that reformulates deep research as a Markov Decision Process, where agents periodically consolidate findings into evolving reports while maintaining focused workspaces, overcoming the context suffocation and noise contamination that plague existing mono-contextual approaches; and (2) WebFrontier, a scalable data synthesis engine that generates high-quality training data through tool-augmented complexity escalation, enabling systematic creation of research tasks that bridge the gap between passive knowledge recall and active knowledge construction. Notably, we find that the training data from our paradigm significantly enhances tool-use capabilities even for traditional mono-contextual methods. Furthermore, our paradigm naturally scales through parallel thinking, enabling concurrent multi-agent exploration for more comprehensive conclusions. Extensive experiments across 6 challenging benchmarks demonstrate that WebResearcher achieves state-of-the-art performance, even surpassing frontier proprietary systems.
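The iterative paradigm described in the abstract can be illustrated with a minimal sketch. It assumes that the only state carried between rounds is the question, the evolving report, and the latest tool result, and that the workspace is rebuilt from those each round rather than appended to one growing history. The names `State`, `Step`, `build_workspace`, `run_research`, `toy_policy`, and `toy_tool` are hypothetical and chosen for illustration; they are not taken from the paper, and the policy here is a stub standing in for a language model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class State:
    """Assumed state carried between rounds: question, report, last tool result."""
    question: str
    report: str = ""
    last_tool_result: str = ""


@dataclass(frozen=True)
class Step:
    """One round of output from the (hypothetical) policy."""
    updated_report: str
    tool_query: Optional[str] = None    # set when the agent wants to call a tool
    final_answer: Optional[str] = None  # set when the agent decides to stop


def build_workspace(state: State) -> str:
    """Rebuild a compact context each round instead of keeping the full history."""
    return (
        f"Question: {state.question}\n"
        f"Report so far: {state.report}\n"
        f"Latest tool result: {state.last_tool_result}"
    )


def run_research(
    question: str,
    policy: Callable[[str], Step],
    tool: Callable[[str], str],
    max_rounds: int = 8,
) -> str:
    """Iterate policy -> tool -> consolidated report until an answer is produced."""
    state = State(question=question)
    for _ in range(max_rounds):
        step = policy(build_workspace(state))
        if step.final_answer is not None:
            return step.final_answer
        result = tool(step.tool_query) if step.tool_query else ""
        # Only the updated report and the newest tool result survive the round.
        state = State(question, step.updated_report, result)
    return state.report  # fall back to the report if the round budget runs out


def toy_tool(query: str) -> str:
    """Stand-in for a search or browsing tool."""
    return f"result for {query!r}"


def toy_policy(workspace: str) -> Step:
    """Stub policy: call the tool once, then answer."""
    if "result for" not in workspace:
        return Step(updated_report="searching", tool_query="capital of France")
    return Step(updated_report="done", final_answer="answered after one tool call")


if __name__ == "__main__":
    print(run_research("What is the capital of France?", toy_policy, toy_tool))
```

Because the workspace is rebuilt from a fixed set of fields, its size depends on the length of the report rather than on the number of rounds, which is the property the abstract contrasts with mono-contextual approaches.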

Problem

Research questions and friction points this paper is trying to address.

Overcoming context suffocation and noise in mono-contextual research methods

Bridging passive knowledge recall with active knowledge construction

Enhancing tool-use capabilities through scalable data synthesis

Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative deep-research paradigm formulated as a Markov Decision Process

Scalable data synthesis engine with tool-augmented complexity escalation

Parallel thinking enabling concurrent multi-agent exploration
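As a rough illustration of the parallel thinking idea in the list above, the sketch below runs several independent agents concurrently and then combines their outputs. The combination step uses a simple majority vote over final answers, which is a stand-in chosen for brevity and not necessarily how the paper synthesizes results. `parallel_research`, `synthesize`, and `toy_agent` are hypothetical names, and the agent is a stub.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# Each agent returns (report, answer); this pairing is an assumption for the sketch.
AgentOutput = Tuple[str, str]


def parallel_research(
    question: str,
    agent: Callable[[str, int], AgentOutput],
    n_agents: int = 4,
) -> List[AgentOutput]:
    """Run n_agents independent explorations of the same question concurrently."""
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        # The seed only serves to make each agent's exploration differ.
        return list(pool.map(lambda seed: agent(question, seed), range(n_agents)))


def synthesize(outputs: List[AgentOutput]) -> str:
    """Stand-in synthesis: majority vote over the agents' final answers."""
    votes = Counter(answer for _report, answer in outputs)
    return votes.most_common(1)[0][0]


def toy_agent(question: str, seed: int) -> AgentOutput:
    """Stub agent: three of four seeds agree on the same answer."""
    answer = "41" if seed == 3 else "42"
    return (f"report from agent {seed} on {question!r}", answer)


if __name__ == "__main__":
    outputs = parallel_research("toy question", toy_agent, n_agents=4)
    print(synthesize(outputs))
```

A fuller version would pass the agents' reports, not just their answers, to a final synthesis step, so that the conclusion can draw on evidence that only some agents found.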

🔎 Similar Papers

Long-Horizon Planning for Multi-Agent Robots in Partially Observable Environments

2024-07-14 | arXiv.org | Citations: 1

A Survey on Large Language Model based Autonomous Agents

2023-08-22 | Frontiers Comput. Sci. | Citations: 866
