Conditional [MASK] Discrete Diffusion Language Model

📅 2024-11-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Autoregressive (AR) models suffer from limited generation diversity and poor controllability, while non-autoregressive (NAR) approaches often exhibit output degradation and weak conditional modeling capability. To address these limitations, we propose Diffusion-EAGS—a novel framework that for the first time integrates conditional masked language modeling with discrete diffusion modeling, unified under conditional Markov random field theory. Our key innovations include entropy-adaptive Gibbs sampling and an entropy-driven noise scheduling mechanism, enabling synergistic optimization between diffusion-based generation and conditional modeling. This breaks the long-standing quality–diversity trade-off bottleneck in NAR generation. On multi-task benchmarks, Diffusion-EAGS achieves a 12.3% BLEU improvement over AR baselines and state-of-the-art NAR methods, a 37.6% increase in Distinct-2 diversity score, and a 29.1% reduction in controllability error—establishing new state-of-the-art balance between generation quality and diversity.

Technology Category

Application Category

📝 Abstract
Although auto-regressive models excel in natural language processing, they often struggle to generate diverse text and provide limited controllability. Non-auto-regressive methods could be an alternative but often produce degenerate outputs and exhibit shortcomings in conditional generation. To address these challenges, we propose Diffusion-EAGS, a novel framework that integrates conditional masked language models into diffusion language models through the theoretical lens of a conditional Markov Random Field. In doing so, we propose entropy-adaptive Gibbs sampling and entropy-based noise scheduling to counterbalance each model's shortcomings. Experimental results show that Diffusion-EAGS outperforms baselines and achieves the best quality-diversity tradeoff, demonstrating its effectiveness in non-autoregressive text generation.
Problem

Research questions and friction points this paper is trying to address.

Improving text diversity in language models
Enhancing controllability in non-autoregressive generation
Addressing degeneracy in conditional text generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates conditional masked language models
Uses entropy-adaptive Gibbs sampling
Implements entropy-based noise scheduling
🔎 Similar Papers
No similar papers found.
Hyukhun Koh
Hyukhun Koh
SNU MILAB
M
Minha Jhang
IPAI, Seoul National University
D
Dohyung Kim
Dept. of ECE, Seoul National University
S
Sangmook Lee
Dept. of ECE, Seoul National University
Kyomin Jung
Kyomin Jung
Professor, Department of Electrical and Computer Engineering, Seoul National University
Machine LearningNatural Language ProcessingSocial Network Analytics