DiLaDiff: Distilled Latent-Augmented Diffusion for Language Modeling

📅 2026-05-22
📈 Citations: 0
Influential: 0

🤖 AI Summary
Diffusion language models struggle to balance generation quality and inference efficiency because they capture correlations between decoded tokens poorly. This work proposes DiLaDiff, a latent-guided variant of masked diffusion language models with three parts: an autoencoder, fine-tuned from an existing masked diffusion language model, that builds a continuous semantic latent space; a latent diffusion model that learns a prior over that space; and consistency distillation that compresses the prior into a few-step latent generator. Even without distillation, the latent-guided model outperforms the masked diffusion baseline while accelerating inference. With distillation, the latent is generated in negligible time compared to discrete decoding.
📝 Abstract
Diffusion language models intrinsically fail to capture correlations between decoded tokens, which leads to a harsh trade-off between sampling quality and throughput. To solve this issue, we propose DiLaDiff, a variant of masked diffusion language models with three components: (1) a continuous latent space with semantic capabilities, learned by an auto-encoder fine-tuned from an existing masked diffusion language model; (2) a latent diffusion model learning the prior over the encoder distribution; (3) a consistency model distilling the learned prior into a few-step latent generative model. We show that, even without distillation, our latent-guided diffusion model outperforms the masked diffusion baseline while significantly accelerating inference. Consistency distillation further lowers the computational overhead of continuous diffusion, such that the latent is generated in negligible time compared to discrete decoding.
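Component (3) can be illustrated with a minimal sketch of generic multistep consistency sampling. The abstract does not specify DiLaDiff's sampler, noise schedule, or network, so everything here is an assumption: `toy_consistency_fn`, `few_step_sample`, the time grid `(80.0, 5.0, 1.0)`, and `eps` are hypothetical names and values. To keep the example checkable, the "latents" are drawn from a toy Gaussian N(mu, s^2), for which the consistency function has a closed form. In a real system that function would be a trained network, and the resulting latent would condition the discrete decoder.

```python
import numpy as np


def toy_consistency_fn(x, t, mu=2.0, s=1.0, eps=0.002):
    """Closed-form consistency function for toy latents z ~ N(mu, s^2).

    Assumes the noising process x_t = z + t * noise. It maps a noisy point
    x at noise level t back to the start (t = eps) of its probability-flow
    ODE trajectory, and satisfies f(x, eps) = x.
    Hypothetical stand-in for a trained consistency network.
    """
    return mu + (x - mu) * np.sqrt(s**2 + eps**2) / np.sqrt(s**2 + t**2)


def few_step_sample(f, shape, times=(80.0, 5.0, 1.0), eps=0.002, seed=0):
    """Generic multistep consistency sampling with len(times) calls to f."""
    rng = np.random.default_rng(seed)
    # Start from pure noise at the largest noise level.
    x = rng.standard_normal(shape) * times[0]
    z = f(x, times[0])
    # Optional refinement: re-noise to a smaller level, then denoise again.
    for t in times[1:]:
        x = z + np.sqrt(t**2 - eps**2) * rng.standard_normal(shape)
        z = f(x, t)
    return z


# Three function evaluations produce a batch of 4-dimensional toy latents.
latents = few_step_sample(toy_consistency_fn, shape=(20000, 4))
print(latents.mean(), latents.std())
```

The point of the sketch is the cost profile: the latent is produced with a handful of function evaluations rather than a long reverse-diffusion chain, which is the regime the abstract describes as negligible next to discrete decoding.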
Problem

Research questions and friction points this paper is trying to address.

diffusion language models
token correlations
sampling quality
throughput
inference efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

latent diffusion
consistency distillation
masked diffusion language model
semantic latent space
accelerated inference