🤖 AI Summary
This work addresses two challenges in interactive deep survey generation: imprecise capture of the user's research perspective and unsystematic construction of the survey framework. To tackle these, we propose a hierarchical multi-agent system built on the Model Context Protocol (MCP). Methodologically, we design a decoupled, modular architecture that encapsulates atomic capabilities, such as skeleton initialization, digest construction, and skeleton refinement, as independently schedulable MCP servers. A high-level planning agent dynamically orchestrates multi-turn interactive workflows based on tool descriptions and execution history, enabling human-in-the-loop intervention and flexible composition. Our primary contribution is the first application of the MCP paradigm to long-form deep survey generation, giving users greater control and customization over the generation process. Human evaluation shows that our system outperforms representative baselines in both content depth and length, supporting the value of modular, MCP-based planning for complex generative tasks.
📝 Abstract
We introduce LLM x MapReduce-V3, a hierarchically modular agent system designed for long-form survey generation. Building on its predecessor, LLM x MapReduce-V2, this version adopts a multi-agent architecture in which individual functional components, such as skeleton initialization, digest construction, and skeleton refinement, are implemented as independent Model Context Protocol (MCP) servers. These atomic servers can be aggregated into higher-level servers, yielding a hierarchically structured system. A high-level planner agent dynamically orchestrates the workflow by selecting appropriate modules based on their MCP tool descriptions and the execution history. This modular decomposition facilitates human-in-the-loop intervention, affording users greater control and customization over the research process. Through multi-turn interaction, the system captures the intended research perspectives and generates a comprehensive skeleton, which is then developed into an in-depth survey. Human evaluations show that our system surpasses representative baselines in both content depth and length, highlighting the strength of MCP-based modular planning.
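The orchestration pattern described in the abstract can be illustrated with a minimal, self-contained Python sketch. This is not the authors' implementation and does not use any real MCP SDK: `Tool`, `Server`, `plan_next`, and `run` are hypothetical names, the three tool functions return placeholder data, and a keyword-matching rule stands in for the LLM planner. It only shows the shape of the idea: atomic servers aggregated into a higher-level server, and a planner that picks the next tool from tool descriptions plus the execution history, with a hook for user feedback before each step.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]


@dataclass
class Tool:
    """Hypothetical stand-in for one MCP tool: name, description, callable."""
    name: str
    description: str
    run: Callable[[State], State]


@dataclass
class Server:
    """Hypothetical stand-in for an MCP server; children make it hierarchical."""
    name: str
    tools: List[Tool] = field(default_factory=list)
    children: List["Server"] = field(default_factory=list)

    def list_tools(self) -> List[Tool]:
        # A higher-level server exposes its own tools plus those of its children.
        found = list(self.tools)
        for child in self.children:
            found.extend(child.list_tools())
        return found


# Placeholder tool bodies (assumptions): a real system would call an LLM here.
def init_skeleton(state: State) -> State:
    return {**state, "skeleton": [f"Background of {state['topic']}", "Methods"]}


def build_digests(state: State) -> State:
    return {**state, "digests": [f"digest for: {s}" for s in state["skeleton"]]}


def refine_skeleton(state: State) -> State:
    extra = state.get("feedback")
    return {**state, "skeleton": list(state["skeleton"]) + ([extra] if extra else [])}


def plan_next(tools: List[Tool], history: List[str]) -> Optional[Tool]:
    """Toy planner: match a goal keyword against tool descriptions,
    skipping tools already recorded in the execution history."""
    for keyword in ("initialize", "digest", "refine"):
        for tool in tools:
            if keyword in tool.description.lower() and tool.name not in history:
                return tool
    return None


def run(root: Server, topic: str, ask_user: Callable[[str, State], Optional[str]]):
    """Plan, optionally collect user feedback, execute, and record each step."""
    state: State = {"topic": topic}
    history: List[str] = []
    tools = root.list_tools()
    while (tool := plan_next(tools, history)) is not None:
        feedback = ask_user(tool.name, state)  # human-in-the-loop hook
        if feedback:
            state["feedback"] = feedback
        state = tool.run(state)
        history.append(tool.name)
    return state, history


# Two atomic servers aggregated into one higher-level server.
skeleton_server = Server("skeleton", tools=[
    Tool("init_skeleton", "Initialize a survey skeleton from the topic.", init_skeleton),
    Tool("refine_skeleton", "Refine the skeleton using user feedback.", refine_skeleton),
])
digest_server = Server("digest", tools=[
    Tool("build_digests", "Construct a digest for each skeleton section.", build_digests),
])
root = Server("survey", children=[skeleton_server, digest_server])
```

Calling `run(root, "some topic", ask_user)` with a callback that returns a string only for `refine_skeleton` appends that string to the skeleton as a new section. Because the planner sees only names and descriptions, a tool can be replaced or a new server attached without changing the loop, which is the property the modular decomposition is meant to provide.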