🤖 AI Summary
Existing discrete diffusion language models (e.g., masked diffusion) achieve reasonable performance but cannot revise previously generated tokens, which limits output quality. To address this, we propose Generalized Interpolating Discrete Diffusion (GIDD), the first unified theoretical framework for interpolating discrete diffusion, along with a novel diffusion evidence lower bound (ELBO). GIDD admits a generalized family of interpolating noise schedules, including a hybrid of masking and uniform noise, which enables self-correcting sampling: the model can detect and fix its own mistakes during generation. Under matched compute, GIDD achieves state-of-the-art performance in diffusion-based language modeling, with notable gains in sample quality and textual coherence. The code and pretrained models are publicly released.
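To make the hybrid masking-uniform corruption concrete, here is a toy sketch of a forward noising step. This is not the paper's actual formulation (GIDD uses a continuous-time interpolating schedule); the per-token corruption probability `t`, the mixing weight `p_uniform`, and the toy vocabulary are illustrative assumptions.

```python
import random

MASK = "[MASK]"
VOCAB = ["the", "cat", "sat", "on", "mat"]  # toy vocabulary (illustrative)

def gidd_noise(tokens, t, p_uniform=0.2, rng=random):
    """Corrupt a sequence with a mix of masking and uniform noise.

    Each token is corrupted with probability t; a corrupted token becomes
    [MASK] with probability (1 - p_uniform), or is replaced by a uniformly
    random vocabulary token otherwise.
    """
    out = []
    for tok in tokens:
        if rng.random() < t:  # corrupt this position
            if rng.random() < p_uniform:
                out.append(rng.choice(VOCAB))  # uniform noise
            else:
                out.append(MASK)  # masking noise
        else:
            out.append(tok)  # keep the clean token
    return out
```

Because some corrupted positions carry a plausible-but-wrong token rather than `[MASK]`, a model trained on this process must learn to judge and overwrite unmasked tokens, which is what unlocks self-correction at sampling time.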
📝 Abstract
While state-of-the-art language models achieve impressive results through next-token prediction, they have inherent limitations such as the inability to revise already generated tokens. This has prompted exploration of alternative approaches such as discrete diffusion. However, masked diffusion, which has emerged as a popular choice due to its simplicity and effectiveness, reintroduces this inability to revise tokens. To overcome this, we generalize masked diffusion and derive the theoretical backbone of a family of generalized interpolating discrete diffusion (GIDD) processes offering greater flexibility in the design of the noising process. Leveraging a novel diffusion ELBO, we achieve compute-matched state-of-the-art performance in diffusion language modeling. Exploiting GIDD's flexibility, we explore a hybrid approach combining masking and uniform noise, leading to improved sample quality and unlocking the ability for the model to correct its own mistakes, an area where autoregressive models have notoriously struggled. Our code and models are open-source: https://github.com/dvruette/gidd/
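The self-correction ability described above can be sketched as a simple post-hoc editing step: wherever the denoiser assigns low probability to the currently sampled token, resample that position. This is a hypothetical simplification, not the sampler from the paper; the `threshold` parameter and the dict-of-probabilities model output are assumptions for illustration.

```python
def self_correct(tokens, model_probs, threshold=0.1):
    """One illustrative self-correction step.

    `model_probs` stands in for the denoiser's output: a list (one entry per
    position) of dicts mapping token -> probability. Any position where the
    current token's probability falls below `threshold` is replaced by the
    model's argmax token at that position.
    """
    out = list(tokens)
    for i, tok in enumerate(tokens):
        probs = model_probs[i]
        if probs.get(tok, 0.0) < threshold:  # model disagrees with this token
            out[i] = max(probs, key=probs.get)  # replace with argmax token
    return out
```

An autoregressive sampler cannot do this, since earlier tokens are frozen once emitted; a diffusion model trained with uniform noise scores every position at every step, so such revisions come for free.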