Forward-Learned Discrete Diffusion: Learning how to noise to denoise faster

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Traditional discrete diffusion models generate inefficiently because they rely on a fixed forward noising process and a factorized reverse process, which together make it hard to approximate the target distribution accurately within a limited number of sampling steps. This work proposes Forward-Learned Discrete Diffusion (FLDD), which introduces, for the first time in discrete diffusion, a learnable non-Markovian forward process. By optimizing learnable marginal and posterior distributions end-to-end, FLDD reduces the gap between the model and the target distribution while preserving the factorized structure of the reverse process. FLDD is trained within a variational inference framework and, under the same sampling budget, outperforms conventional discrete diffusion models with the same reverse parameterization across multiple benchmark tasks.
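
For orientation, the variational objective mentioned above can be written in the usual diffusion form. This is a sketch under assumptions, not the paper's stated loss: the symbols $\theta$ (reverse-process parameters), $\phi$ (forward-process parameters), $T$ (number of steps) and the prior $p(x_T)$ are notation introduced here, and the paper's exact objective may differ.

```latex
\mathcal{L}(\theta, \phi)
  = \mathbb{E}_{q_\phi(x_{1:T} \mid x_0)}\Big[
      -\log p_\theta(x_0 \mid x_1)
      + \sum_{t=2}^{T} D_{\mathrm{KL}}\big(
          q_\phi(x_{t-1} \mid x_t, x_0) \,\big\|\, p_\theta(x_{t-1} \mid x_t)
        \big)
      + D_{\mathrm{KL}}\big( q_\phi(x_T \mid x_0) \,\big\|\, p(x_T) \big)
    \Big]
```

In a fixed-forward model only $\theta$ is trained; making the marginals $q_\phi(x_t \mid x_0)$ and posteriors $q_\phi(x_{t-1} \mid x_t, x_0)$ learnable lets the KL terms shrink from both sides, since the target itself can move toward what a factorized $p_\theta$ is able to represent.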

📝 Abstract

Discrete diffusion models are a powerful class of generative models with strong performance across many domains. For efficiency, however, discrete diffusion typically parameterizes the generative (reverse) process with factorized distributions, which makes it difficult for the model to learn the target process in a small number of steps and necessitates a long, computationally expensive sampling procedure. To reduce the gap between the target and model distributions and enable few-step generation, we propose Forward-Learned Discrete Diffusion (FLDD), which introduces discrete diffusion with a learnable forward (noising) process. Rather than fixing a Markovian forward chain, we adopt a non-Markovian formulation with learnable marginal and posterior distributions. This allows the generative process to remain factorized while matching the target defined by the noising process. We train all parameters end-to-end under the standard variational objective. Experiments on various benchmarks show that, for a given number of sampling steps, our approach produces higher-quality samples than conventional discrete diffusion models using the same reverse parameterization.
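
The gap that a factorized reverse step leaves can be seen in a toy case. The sketch below is purely illustrative and is not taken from the paper: it assumes a hypothetical two-token binary target in which both tokens are always equal, and measures how far the best per-token (factorized) distribution is from it. The names `kl`, `target`, and `best_factorized` are introduced here for illustration.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) in nats; entries where p == 0 contribute nothing."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical target over two binary tokens: the tokens always agree,
# so only the sequences "00" and "11" have probability mass.
target = np.array([[0.5, 0.0],
                   [0.0, 0.5]])

# Among factorized distributions q(a, b) = q_a(a) * q_b(b), the minimizer of
# KL(target || q) is the product of the target's own marginals.
marg_a = target.sum(axis=1)
marg_b = target.sum(axis=0)
best_factorized = np.outer(marg_a, marg_b)

# Irreducible gap for a single factorized step on this target.
gap = kl(target, best_factorized)
print(f"KL(target || best factorized) = {gap:.4f} nats")
```

On this toy target no factorized distribution can do better than a gap of log 2 nats in a single step, which is why fixed-forward models need many steps to build up correlations. A learnable forward process instead changes the per-step targets so that a factorized reverse step can match them more closely.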

Problem

Research questions and friction points this paper is trying to address.

discrete diffusion

generative models

sampling efficiency

factorized distributions

few-step generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

discrete diffusion

learnable forward process

non-Markovian diffusion