Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion language models suffer from semantic drift, repetition, and incoherence in long-text generation due to contextual decay induced by large decoding windows. To address this, we propose a convolutional normalized decoding mechanism that captures long-range dependencies via localized receptive fields—enabling effective context compression without chunking—and a rejection-based rule fine-tuning strategy that imposes explicit semantic consistency constraints during post-training. Together, these components enhance contextual fidelity and fluency in distant-token generation. Experiments demonstrate state-of-the-art performance on open-generation benchmarks (e.g., AlpacaEval), with a 37% reduction in generation steps, 2.1× inference speedup, and significant improvements in coherence and relevance.

📝 Abstract
Autoregressive (AR) language models generate text one token at a time, which limits their inference speed. Diffusion-based language models offer a promising alternative, as they can decode multiple tokens in parallel. However, we identify a key bottleneck in current diffusion LMs: the long decoding-window problem, where tokens generated far from the input context often become irrelevant or repetitive. Previous solutions, such as semi-autoregressive decoding, address this issue by splitting the window into blocks, but this sacrifices speed and bidirectionality, eliminating the main advantage of diffusion models. To overcome this, we propose Convolutional decoding (Conv), a normalization-based method that narrows the decoding window without hard segmentation, leading to better fluency and flexibility. Additionally, we introduce Rejecting Rule-based Fine-Tuning (R2FT), a post-hoc training scheme that better aligns tokens at positions far from context. Our methods achieve state-of-the-art results on open-ended generation benchmarks (e.g., AlpacaEval) among diffusion LM baselines, with significantly fewer decoding steps than previous works, demonstrating both speed and quality improvements.
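The core idea of Conv decoding, as the abstract describes it, is to narrow the decoding window softly rather than splitting it into hard blocks. A minimal sketch of that intuition: at each parallel decoding step, weight per-position model confidences by a smooth locality kernel that decays with distance from already-decoded tokens, then unmask the top-scoring positions. The function name, the exponential kernel, and all parameters below are illustrative assumptions, not the paper's actual normalization.

```python
# Toy sketch of locality-weighted position selection for parallel
# diffusion decoding. The exponential kernel and all names here are
# illustrative assumptions; the paper's Conv normalization may differ.
import numpy as np

def conv_decode_select(confidence, decoded_mask, kernel_width=4.0, k=2):
    """Pick k masked positions to unmask, softly favoring positions
    near already-decoded tokens instead of hard block segmentation."""
    n = len(confidence)
    decoded_idx = np.flatnonzero(decoded_mask)
    # Distance from each position to the nearest decoded token.
    dist = np.array([np.min(np.abs(decoded_idx - i)) if decoded_idx.size else 0
                     for i in range(n)])
    # Soft window: weight decays smoothly with distance from context.
    weight = np.exp(-dist / kernel_width)
    score = confidence * weight
    score[decoded_mask] = -np.inf  # never re-select decoded slots
    # Take the top-k positions by locality-weighted confidence.
    order = np.argsort(score)[::-1][:k]
    return sorted(order.tolist())

conf = np.array([0.9, 0.2, 0.8, 0.7, 0.95, 0.6])
decoded = np.array([True, False, False, False, False, False])
print(conv_decode_select(conf, decoded, kernel_width=2.0, k=2))  # → [2, 3]
```

Note how position 4 has the highest raw confidence (0.95) but loses to positions 2 and 3 once the locality weight is applied: distant tokens are deferred without ever imposing a hard block boundary, which is the contrast with semi-autoregressive decoding the abstract draws.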
Problem

Research questions and friction points this paper is trying to address.

Addressing long decoding-window problem in diffusion language models
Improving fluency and relevance of distantly generated tokens
Maintaining parallel decoding speed while enhancing text quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Convolutional decoding for window narrowing
Rejective fine-tuning for token alignment
Parallel token generation via diffusion models
Yeongbin Seo
Department of Artificial Intelligence, Yonsei University
Dongha Lee
Yonsei University
Data mining, Information retrieval, Natural language processing
Jaehyung Kim
Department of Artificial Intelligence, Yonsei University
Jinyoung Yeo
Yonsei University
Natural Language Processing, Large Language Models, AI Agents