Consistent Diffusion Language Models

📅 2026-04-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Discrete diffusion language models promise parallel generation, but high-quality samples still require hundreds of iterative refinement steps. This work proposes Multi-Path Discrete Consistency (MPDC), a principle that carries consistency training from continuous domains over to discrete text generation. Because discrete diffusion has no sample-space probability-flow ODE, MPDC instead enforces consistency in expectation over stochastic posterior bridge trajectories, which are available in closed form for corruption families including masked and uniform diffusion. Its instantiation, the Consistent Diffusion Language Model (CDLM), is trained in a single stage without a teacher, and its objective recovers masked diffusion, continuous consistency models, and progressive/discrete distillation as analytic limits or empirical approximations. In experiments, CDLM sets a new state of the art on conditional and unconditional text generation, consistently outperforming strong base discrete diffusion models and often multi-stage distilled baselines, with the largest gains under few-step sampling.

📝 Abstract

Diffusion language models (DLMs) are an attractive alternative to autoregressive models because they promise sublinear-time, parallel generation, yet practical gains remain elusive as high-quality samples still demand hundreds of refinement steps. In continuous domains, consistency training along the probability-flow ODE is a popular recipe to accelerate diffusion. For discrete diffusion, no analogous sample-space ODE exists, making direct adaptation ill-defined. We argue that the natural discrete substitute is not a deterministic trajectory but its stochastic counterpart: the exact posterior bridge, available in closed form for broad corruption families including masked and uniform diffusion. Building on this observation, we introduce Multi-Path Discrete Consistency (MPDC), a new principle that trains a denoiser to be path-invariant in expectation across these stochastic bridges, and instantiate it as the Consistent Diffusion Language Model (CDLM), a single-stage, teacher-free training framework. A single CDLM objective unifies masked diffusion, continuous consistency models, and progressive/discrete distillation as analytic limits or empirical approximations of one common view. Empirically, CDLM establishes a new state of the art on both conditional and unconditional text generation, consistently outperforming strong base discrete diffusion models and often even multi-stage distilled baselines across sampling budgets, with the largest gains in the few-step regime. Together, these results position CDLM as a principled and scalable foundation for the next generation of fast, high-fidelity discrete generative modeling.
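
To make the "closed-form posterior bridge" concrete, here is a minimal Python sketch for the masked (absorbing-state) case only. It is an illustration under stated assumptions, not the paper's implementation: the linear schedule `alpha(t) = 1 - t`, the `MASK` symbol, and the function names `corrupt` and `bridge_sample` are all hypothetical. It relies on the standard property of absorbing-state masking that a token visible at time t was also visible at any earlier time s, while a token masked at t remains masked at s with probability (1 - alpha(s)) / (1 - alpha(t)).

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for illustration


def alpha(t):
    """Assumed linear schedule: probability a token is still visible at time t in [0, 1]."""
    return 1.0 - t


def corrupt(x0, t, rng):
    """Forward masking: each token independently becomes MASK with probability 1 - alpha(t)."""
    return [tok if rng.random() < alpha(t) else MASK for tok in x0]


def bridge_sample(x0, xt, s, t, rng):
    """Draw x_s from q(x_s | x_t, x_0) for 0 <= s < t <= 1 under absorbing-state masking."""
    assert 0.0 <= s < t <= 1.0
    # Probability that a token masked at time t is also masked at the earlier time s.
    stay_masked = (1.0 - alpha(s)) / (1.0 - alpha(t))
    xs = []
    for clean, noisy in zip(x0, xt):
        if noisy != MASK:
            xs.append(noisy)   # visible at t implies visible at s
        elif rng.random() < stay_masked:
            xs.append(MASK)    # still masked at s
        else:
            xs.append(clean)   # revealed between s and t
    return xs


rng = random.Random(0)
x0 = "the cat sat on the mat".split()
xt = corrupt(x0, 0.8, rng)
# Several stochastic bridge paths between the same endpoints (x0, xt).
paths = [bridge_sample(x0, xt, 0.3, 0.8, rng) for _ in range(3)]
```

Each call to `bridge_sample` yields a different intermediate state for the same pair of endpoints. These are the multiple stochastic paths across which, per the abstract, the denoiser is trained to be invariant in expectation; the training objective itself is not reproduced here.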

Problem

Research questions and friction points this paper is trying to address.

discrete diffusion

consistency training

language models

parallel generation

posterior bridge

Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete Diffusion

Consistency Training

Posterior Bridge