CogGen: A Cognitively Inspired Recursive Framework for Deep Research Report Generation

2026-04-18

AI Summary
This study addresses limitations of current large language models in generating in-depth research reports: linear generation pipelines propagate errors, impede global restructuring, and hinder effective multimodal integration. To overcome these challenges, the authors propose a cognitively inspired recursive generation framework whose hierarchical recursive architecture enables dynamic planning and holistic structural refinement. Central to this approach is Abstract Visual Representation (AVR), which supports text–image layout optimization without pixel-level regeneration. The work makes three key contributions: it applies cognitive recursion to report generation, introduces the Cognitive Load Evaluation Framework (CLEF), and releases a multimodal benchmark dataset derived from Our World in Data. Experimental results show that the proposed system achieves state-of-the-art performance among open-source systems, producing reports comparable to those of professional analysts and surpassing Gemini Deep Research.

Abstract
The autonomous synthesis of deep research reports represents a critical frontier for Large Language Models (LLMs), demanding sophisticated information orchestration and non-linear narrative logic. Current approaches rely on rigid predefined linear workflows, which cause error accumulation, preclude global restructuring from subsequent insights, and ultimately limit in-depth multimodal fusion and report quality. We propose CogGen, a Cognitively inspired recursive framework for deep research report Generation. Leveraging a Hierarchical Recursive Architecture to simulate cognitive writing, CogGen enables flexible planning and global restructuring. To extend this recursivity to multimodal content, we introduce Abstract Visual Representation (AVR): a concise intent-driven language that iteratively refines visual-text layouts without pixel-level regeneration overhead. We further present CLEF, a Cognitive Load Evaluation Framework, and curate a new benchmark from Our World in Data (OWID). Extensive experiments show CogGen achieves state-of-the-art results among open-source systems, generating reports comparable to professional analysts' outputs and surpassing Gemini Deep Research. Our code and dataset are available at https://github.com/NJUNLP/CogGen.
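
The contrast between a fixed linear workflow and recursive writing with global restructuring can be illustrated with a toy sketch. This is not CogGen's implementation: the `Section` class, the `plan` stand-in (which a real system would replace with an LLM call), and the title-based reordering rule are all hypothetical, chosen only to make the control flow visible.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One node of a hypothetical report outline (not CogGen's data model)."""
    title: str
    depth: int = 0
    text: str = ""
    children: list["Section"] = field(default_factory=list)


def plan(title: str) -> list[str]:
    """Stand-in planner; a real system would ask an LLM for subtopics."""
    return [f"{title} / part {i}" for i in (1, 2)]


def summarize(node: Section) -> None:
    """Compose a parent's text from whatever children it currently has."""
    node.text = "Overview of: " + ", ".join(c.title for c in node.children)


def write(node: Section, max_depth: int = 2) -> None:
    """Recursively expand a section: plan children, write them, then summarize."""
    if node.depth >= max_depth:
        node.text = f"Draft for {node.title}"
        return
    for sub in plan(node.title):
        child = Section(sub, node.depth + 1)
        write(child, max_depth)
        node.children.append(child)
    # The parent text is written after its children exist, so it reflects them.
    summarize(node)


def restructure(node: Section) -> None:
    """Toy global pass: reorder siblings (here, by title) and refresh overviews."""
    if node.children:
        node.children.sort(key=lambda c: c.title, reverse=True)
        summarize(node)
    for child in node.children:
        restructure(child)


def count(node: Section) -> int:
    """Number of sections in the outline tree."""
    return 1 + sum(count(c) for c in node.children)
```

Calling `write(Section("Report"))` builds the outline top-down, and a later `restructure` call revisits the whole tree, which is the step a strictly linear pipeline has no place for.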

Problem

Research questions and friction points this paper is trying to address.

deep research report generation
Large Language Models
multimodal fusion
non-linear narrative logic
error accumulation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Recursive Framework
Cognitive Inspiration
Abstract Visual Representation
Multimodal Fusion
Report Generation