A Tale of Two Temperatures: Simple, Efficient, and Diverse Sampling from Diffusion Language Models

📅 2026-04-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing sampling methods for diffusion language models (dLLMs) mostly trade off generation speed against the quality of individual samples, leaving diversity across samples less well understood. This work proposes softened, tempered versions of familiar confidence-based remasking heuristics and motivates them with an idealized formal model of fork tokens, analyzing how remasking affects the expected entropy at the forks. Empirically, the tempered heuristics close the exploration gap (pass@k) between existing confidence-based sampling and autoregressive sampling, and outperform both when cost is controlled (pass@NFE), while retaining the computational benefits of the original heuristics. The work also studies how the added diversity carries over to downstream post-training and test-time compute scaling.
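To make the idea of a "softened, tempered" confidence-based heuristic concrete, here is a minimal sketch. It is not the paper's algorithm: the function name `select_positions`, the temperature parameter `tau`, and the choice of sampling positions with probability proportional to confidence^(1/tau) (via the Gumbel-top-k trick) are all assumptions made for illustration. With `tau = 0` it reduces to the usual deterministic rule of keeping the `k` most confident positions and remasking the rest.

```python
import math
import random


def select_positions(confidences, k, tau, rng=random):
    """Choose k masked positions to keep at this step; the rest are remasked.

    Hypothetical illustration, not the paper's method.
    tau == 0: deterministic top-k by confidence (the familiar heuristic).
    tau > 0:  sample k positions without replacement, with probability
              proportional to confidence ** (1 / tau), using Gumbel-top-k.
    """
    if not 0 <= k <= len(confidences):
        raise ValueError("k must be between 0 and the number of positions")
    if tau < 0:
        raise ValueError("tau must be non-negative")

    if tau == 0:
        scores = list(confidences)
    else:
        scores = []
        for c in confidences:
            # Clamp to avoid log(0) for both the confidence and the uniform draw.
            logit = math.log(max(c, 1e-12)) / tau
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            gumbel = -math.log(-math.log(u))
            scores.append(logit + gumbel)

    # Highest perturbed scores win; return the chosen indices in position order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

Larger values of `tau` spread the choice over less confident positions, which is one way different samples could commit to different tokens early; the number of positions kept per step, and hence the number of model calls, is unchanged.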


📝 Abstract
Much work has been done on designing fast and accurate sampling for diffusion language models (dLLMs). However, these efforts have largely focused on the tradeoff between speed and quality of individual samples; how to additionally ensure diversity across samples remains less well understood. In this work, we show that diversity can be increased by using softened, tempered versions of familiar confidence-based remasking heuristics, retaining their computational benefits and offering simple implementations. We motivate this approach by introducing an idealized formal model of fork tokens and studying the impact of remasking on the expected entropy at the forks. Empirically, the proposed tempered heuristics close the exploration gap (pass@k) between existing confidence-based and autoregressive sampling, hence outperforming both when controlling for cost (pass@NFE). We further study how the increase in diversity translates to downstream post-training and test-time compute scaling. Overall, our findings demonstrate that simple, efficient, and diverse sampling from dLLMs is possible.
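For readers unfamiliar with the exploration metric, pass@k is commonly computed with the standard unbiased estimator: draw n samples per problem, count the c correct ones, and estimate the probability that at least one of k samples is correct as 1 - C(n-c, k) / C(n, k). The sketch below implements that widely used formula; whether the paper uses exactly this estimator is an assumption, and the function name `pass_at_k` is illustrative. The abstract does not define pass@NFE beyond "controlling for cost", so it is not implemented here.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples of which c are correct.

    Returns the probability that a random size-k subset of the n samples
    contains at least one correct sample.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k incorrect samples exist,
    # in which case every size-k subset contains a correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, `pass_at_k(4, 1, 2)` counts the 3 of 6 two-sample subsets that miss the single correct sample. A sampler whose outputs are more diverse tends to gain more from increasing k, which is the gap the abstract refers to.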
Problem

Research questions and friction points this paper is trying to address.

diffusion language models
sampling diversity
remasking heuristics
exploration gap
pass@k
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion language models
tempered sampling
remasking heuristics
sample diversity
fork tokens