🤖 AI Summary
Addressing the longstanding trade-off between efficiency and quality in long-text generation, this paper proposes DrDiff, a dynamic routing diffusion framework. DrDiff integrates three core innovations: (1) a dynamic expert scheduling mechanism for adaptive computational resource allocation; (2) Hierarchical Sparse Attention (HSA), which preserves global modeling capability while reducing time complexity to $O(n)$; and (3) a soft absorption guidance strategy that, combined with DPM-solver++, accelerates diffusion sampling. To the authors' knowledge, DrDiff is the first diffusion-based method for long-text generation to achieve both linear-time inference and high-quality outputs. Experiments on multiple long-text generation benchmarks show that DrDiff significantly outperforms existing state-of-the-art approaches: it reduces diffusion steps by 60% and computational cost by 55%, accelerates generation by 2.3×, and simultaneously improves coherence, factual consistency, and lexical diversity.
📝 Abstract
This paper introduces DrDiff, a novel framework for long-text generation that overcomes the efficiency-quality trade-off through three core technologies. First, we design a dynamic expert scheduling mechanism that intelligently allocates computational resources across the diffusion process based on text complexity, handling generation tasks of varying difficulty more efficiently. Second, we introduce a Hierarchical Sparse Attention (HSA) mechanism that adaptively adjusts attention patterns to the input length, reducing computational complexity from $O(n^2)$ to $O(n)$ while maintaining model performance. Finally, we propose a soft absorption guidance optimization strategy that, combined with DPM-solver++, reduces the number of diffusion steps and significantly improves generation speed. Comprehensive experiments on multiple long-text generation benchmarks demonstrate that DrDiff outperforms existing state-of-the-art methods.
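The abstract does not spell out HSA's exact formulation, but the general way sparse attention reaches $O(n)$ cost is to cap the number of keys each query attends to, typically a local window plus a few global tokens. The sketch below illustrates that generic pattern only; the function name, `window`, and `n_global` parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sparse_attention_sketch(q, k, v, window=2, n_global=2):
    """Illustrative linear-cost attention (NOT the paper's HSA).

    Each query attends to a fixed-size local window plus the first
    `n_global` tokens, so scores-per-query is constant and total
    cost grows as O(n) rather than O(n^2).
    """
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # local window indices, merged with a few global token indices
        idx = sorted(set(range(lo, hi)) | set(range(min(n_global, n))))
        scores = q[i] @ k[idx].T / np.sqrt(d)
        w = np.exp(scores - scores.max())   # numerically stable softmax
        w /= w.sum()
        out[i] = w @ v[idx]                 # convex combination of values
    return out
```

With `window` and `n_global` held constant, each query touches at most `2*window + 1 + n_global` keys, which is what makes the overall cost linear in sequence length; an adaptive variant would choose these parameters from the input length, as the abstract describes.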