AI Summary
This study addresses critical limitations of current large language models in generating in-depth research reports, which are constrained by linear generation pipelines that propagate errors, impede global restructuring, and hinder effective multimodal integration. To overcome these challenges, the authors propose CogGen, a cognitively inspired recursive generation framework whose hierarchical recursive architecture enables flexible planning and global restructuring. Central to this approach is Abstract Visual Representation (AVR), a concise intent-driven language that iteratively refines text-image layouts without requiring pixel-level regeneration. The work makes three key contributions: it applies cognitive recursion to report generation, introduces the Cognitive Load Evaluation Framework (CLEF), and releases a new multimodal benchmark dataset derived from Our World in Data (OWID). Experimental results show that the proposed system achieves state-of-the-art performance among open-source systems, producing reports comparable to those of professional analysts and surpassing Gemini Deep Research in quality.
Abstract
The autonomous synthesis of deep research reports represents a critical frontier for Large Language Models (LLMs), demanding sophisticated information orchestration and non-linear narrative logic. Current approaches rely on rigid predefined linear workflows, which cause error accumulation, preclude global restructuring from subsequent insights, and ultimately limit in-depth multimodal fusion and report quality. We propose CogGen, a Cognitively inspired recursive framework for deep research report Generation. Leveraging a Hierarchical Recursive Architecture to simulate cognitive writing, CogGen enables flexible planning and global restructuring. To extend this recursivity to multimodal content, we introduce Abstract Visual Representation (AVR): a concise intent-driven language that iteratively refines visual-text layouts without pixel-level regeneration overhead. We further present CLEF, a Cognitive Load Evaluation Framework, and curate a new benchmark from Our World in Data (OWID). Extensive experiments show CogGen achieves state-of-the-art results among open-source systems, generating reports comparable to professional analysts' outputs and surpassing Gemini Deep Research. Our code and dataset are available at https://github.com/NJUNLP/CogGen.