A Survey on Diffusion Language Models

📅 2025-08-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion language models (DLMs) promise high-quality, parallelizable text generation, but they still face open challenges in efficiency, long-sequence handling, and infrastructure requirements. This survey reviews the literature systematically and presents an up-to-date, comprehensive taxonomy of DLMs, clarifying how they relate to autoregressive and masked language models. It analyzes the key components: pre-training and post-training paradigms, iterative denoising, parallel decoding and caching mechanisms, generation-quality improvements, and multimodal extensions. The surveyed advances reportedly let DLMs achieve a several-fold inference speed-up while performing comparably to their autoregressive counterparts. The survey also discusses limitations and outlines future directions for DLM research, engineering optimization, and multimodal extension.

📝 Abstract
Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent advantages in reducing inference latency and capturing bidirectional context, thereby enabling fine-grained control over the generation process. While achieving a several-fold speed-up, recent advancements have allowed DLMs to show performance comparable to their autoregressive counterparts, making them a compelling choice for various natural language processing tasks. In this survey, we provide a holistic overview of the current DLM landscape. We trace its evolution and relationship with other paradigms, such as autoregressive and masked language models, and cover both foundational principles and state-of-the-art models. Our work offers an up-to-date, comprehensive taxonomy and an in-depth analysis of current techniques, from pre-training strategies to advanced post-training methods. Another contribution of this survey is a thorough review of DLM inference strategies and optimizations, including improvements in decoding parallelism, caching mechanisms, and generation quality. We also highlight the latest approaches to multimodal extensions of DLMs and delineate their applications across various practical scenarios. Furthermore, our discussion addresses the limitations and challenges of DLMs, including efficiency, long-sequence handling, and infrastructure requirements, while outlining future research directions to sustain progress in this rapidly evolving field. Project GitHub is available at https://github.com/VILA-Lab/Awesome-DLMs.
Problem

Research questions and friction points this paper is trying to address.

Exploring Diffusion Language Models as an alternative to autoregressive models
Analyzing DLM advantages in parallel generation and bidirectional context
Reviewing DLM limitations and future research directions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel token generation via iterative denoising
Bidirectional context capture for fine-grained control
Inference optimizations like decoding parallelism and caching
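
The first and third points above (parallel generation through iterative denoising) can be illustrated with a toy sketch. It assumes a masked, discrete formulation in which generation starts from an all-mask sequence and each step commits several tokens at once, keeping the most confident proposals. The names `toy_denoiser` and `iterative_unmask` are hypothetical, and the "model" cheats by reading the answer from a known target, so this shows only the control flow of the decoding loop, not a method taken from the survey.

```python
import random

MASK = "<mask>"

def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for a trained DLM.

    For every masked position it proposes a token and a confidence score.
    Assumption: it simply copies the token from `target` and attaches a
    random confidence; a real model would predict both from context.
    """
    return {i: (target[i], rng.random())
            for i, tok in enumerate(tokens) if tok == MASK}

def iterative_unmask(target, num_steps=4, seed=0):
    """Fill an all-mask sequence in at most `num_steps` parallel steps."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    history = [list(tokens)]  # snapshot of the sequence after each step
    for step in range(num_steps):
        proposals = toy_denoiser(tokens, target, rng)
        if not proposals:
            break
        # Commit an even share of the still-masked positions in this step.
        remaining_steps = num_steps - step
        k = -(-len(proposals) // remaining_steps)  # ceiling division
        # Keep the k most confident proposals; the rest stay masked.
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = proposals[i][0]
        history.append(list(tokens))
    return tokens, history

if __name__ == "__main__":
    target = "diffusion models denoise text in parallel".split()
    tokens, history = iterative_unmask(target, num_steps=3)
    for snapshot in history:
        print(" ".join(snapshot))
```

With six tokens and `num_steps=3`, each step commits two tokens, so the sequence is complete after three model calls, whereas token-by-token autoregressive decoding would need six. Which positions are filled first depends on the random confidences.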

Tianyi Li
VILA Lab, Mohamed bin Zayed University of Artificial Intelligence
Mingda Chen
FAIR, Meta
Natural Language Processing, Machine Learning
Bowei Guo
VILA Lab, Mohamed bin Zayed University of Artificial Intelligence
Zhiqiang Shen
VILA Lab, Mohamed bin Zayed University of Artificial Intelligence