🤖 AI Summary
Continuous diffusion models suffer from two key limitations: (i) local adjacency in the forward process impedes long-range transitions, and (ii) time inhomogeneity in the reverse denoising process introduces simulation bias. This paper proposes Quantized Transition Diffusion (QTD), which maps continuous data to a structured discrete latent space via histogram approximation and binary encoding, and designs a continuous-time Markov chain (CTMC) forward process with Hamming-distance-based transitions, enabling efficient long-range jumps. Under minimal score assumptions, QTD achieves near-linear convergence for the first time, unifying the discrete and continuous diffusion paradigms. Furthermore, it introduces truncated uniformization for reverse sampling, with theoretical guarantees of unbiasedness. Algorithmically, QTD approximates a $d$-dimensional target distribution within error $\varepsilon$ using only $O(d \ln^2(d/\varepsilon))$ expected score evaluations, attaining state-of-the-art inference efficiency while significantly improving both convergence bounds and sample quality.
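To make the quantization step concrete, here is a minimal sketch of mapping continuous values to binary codes via histogram (uniform-bin) approximation. The bit width, value range, and encoding below are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def quantize_binary(x, n_bits=8, lo=-1.0, hi=1.0):
    """Map continuous values in [lo, hi] to n_bits-length binary codes.

    Hypothetical sketch: uniform histogram bins followed by binary
    encoding of the bin index, as a stand-in for the paper's
    quantization step.
    """
    n_bins = 2 ** n_bits
    # Histogram approximation: assign each value to one of 2^n_bits bins.
    idx = np.clip(((x - lo) / (hi - lo) * n_bins).astype(int), 0, n_bins - 1)
    # Binary encoding: each bin index becomes an n_bits-length 0/1 vector.
    return ((idx[..., None] >> np.arange(n_bits)) & 1).astype(np.uint8)

def dequantize_binary(codes, lo=-1.0, hi=1.0):
    n_bits = codes.shape[-1]
    idx = (codes.astype(int) * (1 << np.arange(n_bits))).sum(-1)
    # Map each bin index back to the bin's midpoint.
    return lo + (idx + 0.5) * (hi - lo) / (2 ** n_bits)

x = np.array([-0.73, 0.0, 0.42])
codes = quantize_binary(x)          # shape (3, 8), entries in {0, 1}
x_hat = dequantize_binary(codes)
# Reconstruction error is at most half a bin width: (hi - lo) / 2**(n_bits + 1)
```

In this discrete latent space, the Hamming distance between two codes is simply the number of differing bits, which is what the CTMC forward process transitions over.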
📝 Abstract
Continuous diffusion models have demonstrated remarkable performance in data generation across various domains, yet their efficiency remains constrained by two critical limitations: (1) the local adjacency structure of the forward Markov process, which restricts long-range transitions in the data space, and (2) inherent biases introduced during the simulation of time-inhomogeneous reverse denoising processes. To address these challenges, we propose Quantized Transition Diffusion (QTD), a novel approach that integrates data quantization with discrete diffusion dynamics. Our method first transforms the continuous data distribution $p_*$ into a discrete one $q_*$ via histogram approximation and binary encoding, enabling efficient representation in a structured discrete latent space. We then design a continuous-time Markov chain (CTMC) with Hamming distance-based transitions as the forward process, which inherently supports long-range movements in the original data space. For reverse-time sampling, we introduce a *truncated uniformization* technique to simulate the reverse CTMC, which can provably provide unbiased generation from $q_*$ under minimal score assumptions. Through a novel KL dynamic analysis of the reverse CTMC, we prove that QTD can generate samples with $O(d \ln^2(d/\epsilon))$ score evaluations in expectation to approximate the $d$-dimensional target distribution $p_*$ within an $\epsilon$ error tolerance. Our method not only establishes state-of-the-art inference efficiency but also advances the theoretical foundations of diffusion-based generative modeling by unifying discrete and continuous diffusion paradigms.
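The sampling idea above rests on uniformization of a CTMC: jump times are drawn from a Poisson process with a rate $\lambda$ dominating all exit rates, and each candidate jump uses the kernel $P = I + Q/\lambda$. Below is a toy sketch of *plain* uniformization on a two-state chain; the paper's truncated variant additionally bounds the number of jumps, and the generator, rate bound, and state space here are illustrative assumptions:

```python
import numpy as np

def uniformization_sample(Q, x0, T, lam, rng):
    """Simulate a CTMC with generator Q from state x0 up to time T.

    Plain uniformization sketch (not the paper's truncated variant):
    lam must dominate the largest exit rate, lam >= max_i -Q[i, i].
    """
    n = Q.shape[0]
    P = np.eye(n) + Q / lam       # transition kernel of the embedded chain
    K = rng.poisson(lam * T)      # number of candidate jump times on [0, T]
    x = x0
    for _ in range(K):
        x = rng.choice(n, p=P[x]) # candidate jump (may be a self-loop)
    return x

rng = np.random.default_rng(0)
# Toy 2-state chain: symmetric flipping at rate 1 in each direction.
Q = np.array([[-1.0, 1.0], [1.0, -1.0]])
x_T = uniformization_sample(Q, x0=0, T=1.0, lam=2.0, rng=rng)
```

For this symmetric chain started at state 0, the exact marginal is $P(X_T = 1) = (1 - e^{-2T})/2$, which the empirical frequency over many runs matches; the self-loops introduced by $P$ are what make the Poisson clock exact rather than approximate.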