WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

📅 2025-09-16

🤖 AI Summary

Existing AI agents face two key bottlenecks in open-ended deep research (OEDR): (1) static pipelines that decouple planning from evidence acquisition, and (2) one-shot long-text generation that is prone to long-context failures such as "loss in the middle" and hallucination. This paper proposes WebWeaver, a dual-agent framework that pairs a planner with a writer. The planner iteratively interleaves evidence acquisition with outline optimization, producing a source-grounded outline linked to a memory bank of evidence. The writer then composes the report section by section, retrieving from the memory bank only the evidence each section needs. Its core contributions are: (1) a memory bank for managing collected evidence; (2) iterative optimization of a hierarchical outline; and (3) section-level retrieval and writing grounded in cited sources. By departing from rigid pipelines and single-pass generation, the framework mitigates long-context issues and factual inconsistency. The paper reports state-of-the-art results on DeepResearch Bench, DeepConsult, and DeepResearchGym.

📝 Abstract

This paper tackles open-ended deep research (OEDR), a complex challenge where AI agents must synthesize vast web-scale information into insightful reports. Current approaches are plagued by dual-fold limitations: static research pipelines that decouple planning from evidence acquisition and one-shot generation paradigms that easily suffer from long-context failure issues like "loss in the middle" and hallucinations. To address these challenges, we introduce WebWeaver, a novel dual-agent framework that emulates the human research process. The planner operates in a dynamic cycle, iteratively interleaving evidence acquisition with outline optimization to produce a comprehensive, source-grounded outline linking to a memory bank of evidence. The writer then executes a hierarchical retrieval and writing process, composing the report section by section. By performing targeted retrieval of only the necessary evidence from the memory bank for each part, it effectively mitigates long-context issues. Our framework establishes a new state-of-the-art across major OEDR benchmarks, including DeepResearch Bench, DeepConsult, and DeepResearchGym. These results validate our human-centric, iterative methodology, demonstrating that adaptive planning and focused synthesis are crucial for producing high-quality, reliable, and well-structured reports.
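The planner and writer loop described in the abstract can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the authors' implementation: the names `MemoryBank`, `Section`, `run_planner`, `run_writer`, `toy_search`, and `toy_compose` are all hypothetical, the search and composition steps are stubs standing in for web search and LLM calls, and the outline-revision step is reduced to a placeholder that just issues a follow-up query.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MemoryBank:
    """Hypothetical evidence store: snippets keyed by integer id."""
    items: dict[int, str] = field(default_factory=dict)

    def add(self, text: str) -> int:
        evidence_id = len(self.items) + 1
        self.items[evidence_id] = text
        return evidence_id

    def fetch(self, ids: list[int]) -> list[tuple[int, str]]:
        # Return only the requested snippets, paired with their ids.
        return [(i, self.items[i]) for i in ids]


@dataclass
class Section:
    """One outline entry that links to evidence ids in the memory bank."""
    title: str
    evidence_ids: list[int] = field(default_factory=list)


def run_planner(topic: str, search: Callable[[str], list[str]],
                bank: MemoryBank, rounds: int = 2) -> list[Section]:
    """Alternate between acquiring evidence and extending the outline."""
    outline: list[Section] = []
    query = topic
    for _ in range(rounds):
        section = Section(title=query)
        for snippet in search(query):
            section.evidence_ids.append(bank.add(snippet))
        outline.append(section)
        # Placeholder for outline optimization: a real planner would revise
        # the outline and pick the next query based on the new evidence.
        query = f"{topic}: follow-up {len(outline)}"
    return outline


def run_writer(outline: list[Section], bank: MemoryBank,
               compose: Callable[[str, list[tuple[int, str]]], str]) -> str:
    """Write section by section, fetching only that section's evidence."""
    parts = []
    for section in outline:
        evidence = bank.fetch(section.evidence_ids)
        parts.append(compose(section.title, evidence))
    return "\n\n".join(parts)


def toy_search(query: str) -> list[str]:
    # Stub standing in for web search.
    return [f"note on {query}"]


def toy_compose(title: str, evidence: list[tuple[int, str]]) -> str:
    # Stub standing in for LLM writing; cites each snippet by its id.
    body = " ".join(f"{text} [{eid}]" for eid, text in evidence)
    return f"{title}\n{body}"


bank = MemoryBank()
outline = run_planner("solar power", toy_search, bank)
report = run_writer(outline, bank, toy_compose)
print(report)
```

The point of the sketch is the data flow: the outline stores evidence ids rather than evidence text, so the writer's context for each section holds only the snippets that section cites, which is the mechanism the abstract credits for mitigating long-context issues.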

Problem

Research questions and friction points this paper is trying to address.

Addressing open-ended deep research with web-scale information synthesis

Overcoming static pipelines and one-shot generation limitations

Mitigating long-context failure issues and hallucinations in reports

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-agent framework emulates human research

Dynamic cycle interleaves evidence acquisition with outline optimization

Hierarchical retrieval mitigates long-context issues
