🤖 AI Summary
Addressing the longstanding trade-off between efficiency and quality in long-text generation, this paper proposes DrDiff, a dynamic routing diffusion framework. DrDiff integrates three core innovations: (1) a dynamic expert scheduling mechanism for adaptive computational resource allocation; (2) Hierarchical Sparse Attention (HSA), which preserves global modeling capability while reducing time complexity to $O(n)$; and (3) a soft absorption guidance strategy that, combined with DPM-solver++, accelerates diffusion sampling. To the authors' knowledge, DrDiff is the first diffusion-based method for long-text generation to achieve both linear-time inference and high-quality outputs. Experiments on multiple long-text generation benchmarks show that DrDiff significantly outperforms existing state-of-the-art approaches: it reduces diffusion steps by 60% and computational cost by 55%, accelerates generation by 2.3×, and simultaneously improves coherence, factual consistency, and lexical diversity.
📝 Abstract
This paper introduces DrDiff, a novel framework for long-text generation that overcomes the efficiency-quality trade-off through three core technologies. First, we design a dynamic expert scheduling mechanism that intelligently allocates computational resources across the diffusion process based on text complexity, handling generation tasks of varying difficulty more efficiently. Second, we introduce a Hierarchical Sparse Attention (HSA) mechanism that adaptively adjusts attention patterns to the input length, reducing computational complexity from $O(n^2)$ to $O(n)$ while maintaining model performance. Finally, we propose a soft absorption guidance optimization strategy that, combined with DPM-solver++, reduces the number of diffusion steps and significantly improves generation speed. Comprehensive experiments on multiple long-text generation benchmarks demonstrate that DrDiff outperforms existing state-of-the-art methods.
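The abstract does not spell out HSA's exact formulation, but the general way sparse attention reaches $O(n)$ cost is to cap the number of keys each query attends to, typically a local window plus a few global tokens. The sketch below illustrates that generic pattern only; the function name, `window`, and `n_global` parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sparse_attention_sketch(q, k, v, window=2, n_global=2):
    """Illustrative linear-cost attention (NOT the paper's HSA).

    Each query attends to a fixed-size local window plus the first
    `n_global` tokens, so scores-per-query is constant and total
    cost grows as O(n) rather than O(n^2).
    """
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # local window indices, merged with a few global token indices
        idx = sorted(set(range(lo, hi)) | set(range(min(n_global, n))))
        scores = q[i] @ k[idx].T / np.sqrt(d)
        w = np.exp(scores - scores.max())   # numerically stable softmax
        w /= w.sum()
        out[i] = w @ v[idx]                 # convex combination of values
    return out
```

With `window` and `n_global` held constant, each query touches at most `2*window + 1 + n_global` keys, which is what makes the overall cost linear in sequence length; an adaptive variant would choose these parameters from the input length, as the abstract describes.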