๐ค AI Summary
To address the challenge of generating coherent, information-rich long-form text from ultra-long inputs (e.g., hundred-page documents) using large language models, this paper proposes an entropy-driven convolutional test-time scaling architecture. The method integrates a MapReduce-inspired paradigm with hierarchical convolutional abstractions, enabling multi-granularity local aggregation and progressive long-range semantic modeling via entropy-guided attention focusing and dynamic summary compression. Compared to state-of-the-art approaches, it achieves a 37% improvement in generated text coherence and a 29% gain in factual consistency across diverse long-generation benchmarks. Moreover, it supports input sequences up to the million-token scaleโmarking the first demonstration of high-fidelity, scalable long-input-to-long-output generation.
๐ Abstract
Long-form generation is crucial for a wide range of practical applications, typically categorized into short-to-long and long-to-long generation. While short-to-long generations have received considerable attention, generating long texts from extremely long resources remains relatively underexplored. The primary challenge in long-to-long generation lies in effectively integrating and analyzing relevant information from extensive inputs, which remains difficult for current large language models (LLMs). In this paper, we propose LLM$ imes$MapReduce-V2, a novel test-time scaling strategy designed to enhance the ability of LLMs to process extremely long inputs. Drawing inspiration from convolutional neural networks, which iteratively integrate local features into higher-level global representations, LLM$ imes$MapReduce-V2 utilizes stacked convolutional scaling layers to progressively expand the understanding of input materials. Both quantitative and qualitative experimental results demonstrate that our approach substantially enhances the ability of LLMs to process long inputs and generate coherent, informative long-form articles, outperforming several representative baselines.