🤖 AI Summary
Existing diffusion language models (DLMs) suffer from inherent limitations in their denoising strategies: standard sequential diffusion performs unrestricted global updates and is prone to finalizing incomplete context, leading to premature end-of-sequence predictions, while block-wise diffusion disrupts semantic coherence and logical reasoning through fixed-length token segmentation. To address this, we propose WavefrontDiffusion, a dynamic wavefront decoding framework that initiates denoising from already-finalized tokens and adaptively expands the active decoding region outward, thereby aligning the denoising process with linguistic semantic structure. Our method preserves the parallel efficiency and computational budget of block-based decoding while enabling fine-grained, structure-aware adaptive denoising. Evaluated on four reasoning and code-generation benchmarks, WavefrontDiffusion achieves state-of-the-art performance in semantic fidelity, logical consistency, and task-specific accuracy.
📝 Abstract
Diffusion Language Models (DLMs) have shown strong potential for text generation and are becoming a competitive alternative to autoregressive models. The denoising strategy plays an important role in determining the quality of their outputs. Mainstream denoising strategies include Standard Diffusion and BlockDiffusion. Standard Diffusion performs global denoising without restricting the update range, often finalizing incomplete context and causing premature end-of-sequence predictions. BlockDiffusion updates fixed-size blocks in a preset order, but its rigid structure can break apart coherent semantic units and disrupt reasoning. We present WavefrontDiffusion, a dynamic decoding approach that expands a wavefront of active tokens outward from finalized positions. This adaptive process follows the natural flow of semantic structure while keeping computational cost equal to that of block-based methods. Across four benchmarks in reasoning and code generation, WavefrontDiffusion achieves state-of-the-art performance while producing outputs with higher semantic fidelity, showing the value of adaptive scheduling for more coherent and efficient generation.
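To make the scheduling idea concrete, here is a minimal toy sketch of wavefront-style decoding. It is an illustrative assumption, not the authors' implementation: confidences are given as a fixed list rather than produced by a model, and the function names (`wavefront_schedule`, `decode`), the `width` parameter, and the threshold rule are all hypothetical. The point it demonstrates is that the active set grows outward from finalized positions instead of sweeping fixed-size blocks in a preset order.

```python
# Hypothetical sketch of wavefront decoding as described in the abstract.
# Per-position "confidences" stand in for a denoiser's certainty; in a real
# DLM these would come from the model at each denoising step.

def wavefront_schedule(finalized, seq_len, width=2):
    """Next active set: all not-yet-finalized positions within `width`
    of some finalized position (the wavefront around settled tokens)."""
    active = set()
    for pos in finalized:
        for offset in range(-width, width + 1):
            p = pos + offset
            if 0 <= p < seq_len and p not in finalized:
                active.add(p)
    return sorted(active)

def decode(seq_len, confidences, threshold=0.9, width=2):
    """Toy decoding loop: each step, denoise the active wavefront and
    finalize positions whose confidence passes `threshold`. Returns the
    list of positions finalized at each step."""
    finalized = {0}  # assume a known start position (e.g. a BOS token)
    steps = []
    while len(finalized) < seq_len:
        active = wavefront_schedule(finalized, seq_len, width)
        if not active:
            break
        confident = [p for p in active if confidences[p] >= threshold]
        # Always finalize at least one position so the wavefront keeps
        # moving, mirroring a fixed per-step compute budget.
        newly = confident or [max(active, key=lambda p: confidences[p])]
        finalized.update(newly)
        steps.append(sorted(newly))
    return steps

conf = [1.0, 0.95, 0.4, 0.97, 0.3, 0.92]
print(decode(6, conf))  # high-confidence positions settle first,
                        # spreading outward from position 0
```

Unlike a fixed block order, low-confidence positions (2 and 4 above) are deferred until their neighbors are finalized, which is the adaptive behavior the abstract credits with preserving semantic coherence.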