WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

📅 2025-09-16
🤖 AI Summary
Existing AI agents face two key bottlenecks in Open-Ended Deep Research (OEDR): (1) static pipelines that decouple planning from evidence acquisition, and (2) one-shot long-text generation that is prone to long-context failures such as "loss in the middle" and hallucination. This paper proposes WebWeaver, a dual-agent framework in which a Planner iteratively interleaves evidence acquisition with outline optimization, and a Writer then composes the report section by section from the resulting outline. Its core contributions are: (1) a memory bank of evidence that the outline links to; (2) iterative outline optimization driven by newly acquired evidence; and (3) section-level writing that retrieves only the evidence each part needs, keeping the content source-grounded. By departing from rigid pipelines and single-pass generation, the framework mitigates long-context issues. It reports state-of-the-art results on DeepResearch Bench, DeepConsult, and DeepResearchGym, and the authors conclude that adaptive planning and focused synthesis are crucial for high-quality, reliable, and well-structured reports.

📝 Abstract
This paper tackles open-ended deep research (OEDR), a complex challenge where AI agents must synthesize vast web-scale information into insightful reports. Current approaches are plagued by dual-fold limitations: static research pipelines that decouple planning from evidence acquisition and one-shot generation paradigms that easily suffer from long-context failure issues like "loss in the middle" and hallucinations. To address these challenges, we introduce WebWeaver, a novel dual-agent framework that emulates the human research process. The planner operates in a dynamic cycle, iteratively interleaving evidence acquisition with outline optimization to produce a comprehensive, source-grounded outline linking to a memory bank of evidence. The writer then executes a hierarchical retrieval and writing process, composing the report section by section. By performing targeted retrieval of only the necessary evidence from the memory bank for each part, it effectively mitigates long-context issues. Our framework establishes a new state-of-the-art across major OEDR benchmarks, including DeepResearch Bench, DeepConsult, and DeepResearchGym. These results validate our human-centric, iterative methodology, demonstrating that adaptive planning and focused synthesis are crucial for producing high-quality, reliable, and well-structured reports.
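The planner cycle described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `MemoryBank`, `Section`, `toy_search`, and `plan` are hypothetical names, the search step is a hard-coded stand-in for a real web search tool, and "outline optimization" is reduced to keeping only sections that have evidence behind them.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryBank:
    """Stores evidence snippets under integer ids (hypothetical structure)."""
    items: dict[int, str] = field(default_factory=dict)

    def add(self, text: str) -> int:
        evidence_id = len(self.items) + 1
        self.items[evidence_id] = text
        return evidence_id


@dataclass
class Section:
    """One outline entry, linked to the evidence ids that support it."""
    title: str
    evidence_ids: list[int] = field(default_factory=list)


def toy_search(query: str) -> list[str]:
    """Stand-in for web search; a real planner would call a search tool."""
    corpus = {
        "background": ["Snippet A on background."],
        "methods": ["Snippet B on methods.", "Snippet C on methods."],
    }
    return corpus.get(query, [])


def plan(topics: list[str], max_rounds: int = 3) -> tuple[list[Section], MemoryBank]:
    """Alternate evidence acquisition with outline updates, one topic per round."""
    bank = MemoryBank()
    outline: list[Section] = []
    pending = list(topics)
    for _ in range(max_rounds):
        if not pending:
            break
        topic = pending.pop(0)
        # Evidence acquisition: store every snippet in the memory bank.
        ids = [bank.add(snippet) for snippet in toy_search(topic)]
        # Outline update (simplified): keep only sections backed by evidence.
        if ids:
            outline.append(Section(title=topic, evidence_ids=ids))
    return outline, bank


outline, bank = plan(["background", "methods", "unknown"])
```

In this toy run the "unknown" topic returns no evidence, so it never enters the outline, while each remaining section carries the ids of the snippets that justify it.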
Problem

Research questions and friction points this paper is trying to address.

Addressing open-ended deep research with web-scale information synthesis
Overcoming static pipelines and one-shot generation limitations
Mitigating long-context failure issues and hallucinations in reports
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-agent framework emulates human research
Dynamic cycle interleaves evidence and outline
Hierarchical retrieval mitigates long-context issues
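The section-by-section writing idea can be sketched as follows. This is an illustrative assumption rather than the paper's code: `section_context` and `write_report` are hypothetical helpers, the outline is modeled as a plain mapping from section title to evidence ids, and plain string joining stands in for the language-model call a real writer would make.

```python
def section_context(evidence_ids: list[int], bank: dict[int, str]) -> list[tuple[int, str]]:
    """Targeted retrieval: fetch only the evidence cited by one section."""
    return [(i, bank[i]) for i in evidence_ids if i in bank]


def write_report(outline: dict[str, list[int]], bank: dict[int, str]) -> str:
    """Compose the report one section at a time from its own evidence."""
    parts = []
    for title, evidence_ids in outline.items():
        context = section_context(evidence_ids, bank)
        # A real writer would prompt a model with `context`; here each
        # snippet is simply echoed with its source id as a citation.
        body = " ".join(f"{text} [{i}]" for i, text in context)
        parts.append(f"{title}\n{body}")
    return "\n\n".join(parts)


bank = {1: "A.", 2: "B.", 3: "C."}
outline = {"Intro": [1], "Methods": [2, 3]}
report = write_report(outline, bank)
```

Because each section sees only its own evidence, the working context stays small regardless of how large the memory bank grows, which is the property the abstract credits for mitigating long-context issues.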
Zijian Li
Tongyi Lab, Alibaba Group
Xin Guan
Research, Holistic AI
Ethical AI and Normative Reasoning
Bo Zhang
Tongyi Lab, Alibaba Group
Shen Huang
Director of Search, Yihaodian.com
Machine learning, data mining, search, recommendation, personalization
Houquan Zhou
Soochow University
Shaopeng Lai
Tongyi Lab, Alibaba Group
Ming Yan
Tongyi Lab, Alibaba Group
Yong Jiang
Tongyi Lab, Alibaba Group
Pengjun Xie
Alibaba Group
NLP/IR/ML
Fei Huang
Tongyi Lab, Alibaba Group
Jun Zhang
Tongyi Lab, Alibaba Group
Jingren Zhou
Alibaba Group, Microsoft
Cloud Computing, Large Scale Distributed Systems, Machine Learning, Query Processing, Query