Forward-Learned Discrete Diffusion: Learning how to noise to denoise faster

📅 2026-05-18
📈 Citations: 0
Influential: 0

🤖 AI Summary
Traditional discrete diffusion models suffer from low generation efficiency due to their reliance on a fixed forward noising process and a factorized reverse process, which hinders accurate approximation of the target distribution within a limited number of sampling steps. This work proposes Forward-Learned Discrete Diffusion (FLDD), which introduces, for the first time in discrete diffusion, a learnable non-Markovian forward process. By end-to-end optimizing learnable marginal and posterior distributions, FLDD substantially reduces the gap between the model and the target distribution while preserving the factorized structure of the reverse process. Built upon a variational inference framework, FLDD consistently outperforms existing methods across multiple benchmark tasks under the same sampling budget.
📝 Abstract
Discrete diffusion models are a powerful class of generative models with strong performance across many domains. For efficiency, however, discrete diffusion typically parameterizes the generative (reverse) process with factorized distributions, which makes it difficult for the model to learn the target process in a small number of steps and necessitates a long, computationally expensive sampling procedure. To reduce the gap between the target and model distributions and enable few-step generation, we propose Forward-Learned Discrete Diffusion (FLDD), which introduces discrete diffusion with a learnable forward (noising) process. Rather than fixing a Markovian forward chain, we adopt a non-Markovian formulation with learnable marginal and posterior distributions. This allows the generative process to remain factorized while matching the target defined by the noising process. We train all parameters end-to-end under the standard variational objective. Experiments on various benchmarks show that, for a given number of sampling steps, our approach produces higher-quality samples than conventional discrete diffusion models using the same reverse parameterization.
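The gap that a factorized reverse process leaves can be illustrated with a toy calculation. The sketch below is not the paper's method or code; it only shows the general fact that when a target distribution over two tokens is correlated, even the best factorized approximation (the product of its marginals) has a strictly positive KL divergence from it, whereas an independent target can be matched exactly. The function names and the two example distributions are hypothetical, chosen purely for illustration.

```python
import math

def marginals(joint):
    """Return the row and column marginals of a 2-D joint probability table."""
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return rows, cols

def factorization_gap(joint):
    """KL(joint || product of marginals), in nats.

    The product of marginals is the factorized distribution q that
    minimizes KL(joint || q), so this value is the smallest error any
    factorized model can reach on this target.
    """
    rows, cols = marginals(joint)
    gap = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                gap += p * math.log(p / (rows[i] * cols[j]))
    return gap

# Hypothetical target over two binary tokens that tend to agree.
correlated = [[0.45, 0.05],
              [0.05, 0.45]]

# Hypothetical target whose two tokens are independent.
independent = [[0.25, 0.25],
               [0.25, 0.25]]

gap_correlated = factorization_gap(correlated)
gap_independent = factorization_gap(independent)

print(f"correlated target gap:  {gap_correlated:.4f} nats")
print(f"independent target gap: {gap_independent:.4f} nats")
```

Read against the abstract, a learnable forward process can be seen as a way to shape the target so that this kind of gap shrinks while the reverse model stays factorized. How FLDD parameterizes and trains that forward process is specified in the paper itself, not in this sketch.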
Problem

Research questions and friction points this paper is trying to address.

discrete diffusion
generative models
sampling efficiency
factorized distributions
few-step generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

discrete diffusion
learnable forward process
non-Markovian diffusion
few-step generation
end-to-end training