Think While You Generate: Discrete Diffusion with Planned Denoising

📅 2024-10-08
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
To address the low denoising efficiency and degraded reconstruction quality caused by sequential randomness in discrete diffusion models, this paper proposes a "planning-based denoising" framework that decouples generation into a learnable position planner and a local denoiser, enabling on-demand identification of critical positions and precise token/image restoration. This two-stage paradigm overcomes an inherent limitation of conventional mask diffusion, namely uniform or random denoising order, by introducing iterative adaptive mask selection and joint token/image training. Evaluated on text8, OpenWebText, and ImageNet 256×256, DDPD significantly outperforms state-of-the-art mask diffusion models: it achieves language-modeling perplexity that closely approaches autoregressive baselines while improving both inference efficiency and fidelity in image generation.

📝 Abstract
Discrete diffusion has achieved state-of-the-art performance, outperforming or approaching autoregressive models on standard benchmarks. In this work, we introduce Discrete Diffusion with Planned Denoising (DDPD), a novel framework that separates the generation process into two models: a planner and a denoiser. At inference time, the planner selects which positions to denoise next by identifying the most corrupted positions in need of denoising, including both initially corrupted and those requiring additional refinement. This plan-and-denoise approach enables more efficient reconstruction during generation by iteratively identifying and denoising corruptions in the optimal order. DDPD outperforms traditional denoiser-only mask diffusion methods, achieving superior results on language modeling benchmarks such as text8, OpenWebText, and token-based image generation on ImageNet $256 \times 256$. Notably, in language modeling, DDPD significantly reduces the performance gap between diffusion-based and autoregressive methods in terms of generative perplexity. Code is available at https://github.com/liusulin/DDPD.
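The plan-and-denoise inference loop described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `planner` and `denoiser` callables, the `k` positions-per-step parameter, and the interfaces (planner returns a per-position corruption score, denoiser returns per-position token logits) are all assumptions for the sake of the example.

```python
import torch

def plan_and_denoise(planner, denoiser, x, num_steps, k=1):
    """Hypothetical sketch of DDPD-style plan-and-denoise inference.

    planner:  maps a token sequence (seq_len,) to per-position corruption
              scores (seq_len,); higher means "more in need of denoising".
    denoiser: maps a token sequence (seq_len,) to per-position token
              logits (seq_len, vocab_size).
    x:        current (partially corrupted) token sequence, LongTensor.
    """
    for _ in range(num_steps):
        # The planner picks the k positions judged most corrupted.
        scores = planner(x)
        positions = torch.topk(scores, k).indices
        # The denoiser proposes clean tokens; only planned positions update.
        logits = denoiser(x)
        for pos in positions:
            x[pos] = torch.distributions.Categorical(logits=logits[pos]).sample()
    return x
```

In a full implementation both models would be trained networks; here any callables with the assumed shapes work, which makes the control flow of the two-stage decomposition easy to inspect in isolation.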
Problem

Research questions and friction points this paper is trying to address.

Improving discrete diffusion models with planned denoising
Reducing performance gap between diffusion and autoregressive methods
Enhancing efficiency in language and image generation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Separate planner and denoiser models within one framework
Optimal order denoising for efficiency
Outperforms traditional mask diffusion methods