ELF: Embedded Language Flows

2026-05-11
Citations: 0
Influential: 0

AI Summary
Applying continuous diffusion models to discrete language modeling remains challenging. This work proposes Embedded Language Flows (ELF), which builds a language diffusion model in a continuous embedding space using continuous-time Flow Matching and maps back to discrete tokens only at the final time step. Because generation stays in the continuous space, techniques from image-domain diffusion, such as classifier-free guidance (CFG), can be adapted directly. In the reported experiments, ELF outperforms leading discrete and continuous diffusion language models in generation quality while using fewer sampling steps.
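For orientation, the two ingredients named in the summary can be written in their common textbook form. This is a sketch under assumptions, not ELF's confirmed parameterization: the linear interpolation path, the symbols $x_0$, $x_1$, $v_\theta$, $c$, and the guidance weight $w$ are standard conventions chosen here for illustration, and the paper's own notation and time convention may differ.

```latex
% Assumed setup: x_1 is the embedding of a token sequence, x_0 is Gaussian noise.
\begin{align}
  x_t &= (1 - t)\,x_0 + t\,x_1, \qquad x_0 \sim \mathcal{N}(0, I),\; t \in [0, 1] \\
  \mathcal{L}_{\mathrm{FM}}(\theta) &= \mathbb{E}_{t,\,x_0,\,x_1}
      \bigl\lVert v_\theta(x_t, t) - (x_1 - x_0) \bigr\rVert^2 \\
  \tilde{v}_\theta(x_t, t, c) &= v_\theta(x_t, t, \varnothing)
      + w\,\bigl( v_\theta(x_t, t, c) - v_\theta(x_t, t, \varnothing) \bigr)
\end{align}
```

The first line defines the noise-to-data path, the second is the Flow Matching regression target along that path, and the third is the usual classifier-free guidance combination of conditional and unconditional predictions, where $w = 1$ recovers the purely conditional model.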
Abstract
Diffusion and flow-based models have become the de facto approaches for generating continuous data, e.g., in domains such as images and videos. Their success has attracted growing interest in applying them to language modeling. Unlike their image-domain counterparts, today's leading diffusion language models (DLMs) primarily operate over discrete tokens. In this paper, we show that continuous DLMs can be made effective with minimal adaptation to the discrete domain. We propose Embedded Language Flows (ELF), a class of diffusion models in continuous embedding space based on continuous-time Flow Matching. Unlike existing DLMs, ELF predominantly stays within the continuous embedding space until the final time step, where it maps to discrete tokens using a shared-weight network. This formulation makes it straightforward to adapt established techniques from image-domain diffusion models, e.g., classifier-free guidance (CFG). Experiments show that ELF substantially outperforms leading discrete and continuous DLMs, achieving better generation quality with fewer sampling steps. These results suggest that ELF offers a promising path toward effective continuous DLMs.
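The sampling procedure the abstract describes (integrate in embedding space, apply CFG, map to tokens only at the end) might look roughly like the toy sketch below. Everything in it is an assumption made for illustration: `toy_velocity` is a hypothetical closed-form stand-in for a trained network, the unconditional target is simply the mean embedding, the integrator is plain Euler, and the final decode is a dot product against the embedding table rather than the shared-weight network the abstract mentions.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a trained velocity network.

    On the linear path x_t = (1 - t) * noise + t * data, the velocity that
    carries x_t to `target` by t = 1 is (target - x_t) / (1 - t).
    """
    scale = 1.0 / max(1.0 - t, 1e-3)  # clamp to avoid division by zero near t = 1
    return [(g - xi) * scale for g, xi in zip(target, x)]


def sample_token(table, cond_id, steps=8, guidance=1.5, seed=0):
    """Euler-integrate one position from noise to an embedding, then decode.

    `table` is a list of token embeddings (one row per token id) and
    `cond_id` is the token the toy "condition" points at.
    """
    rng = random.Random(seed)
    dim = len(table[0])
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from Gaussian noise

    # Toy unconditional target: the mean of all token embeddings.
    mean = [sum(row[d] for row in table) / len(table) for d in range(dim)]

    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v_cond = toy_velocity(x, t, table[cond_id])
        v_uncond = toy_velocity(x, t, mean)
        # Classifier-free guidance: extrapolate from unconditional to conditional.
        v = [u + guidance * (c - u) for c, u in zip(v_cond, v_uncond)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]  # Euler step

    # Discretize only now: score the final embedding against every token.
    scores = [sum(xi * ei for xi, ei in zip(x, row)) for row in table]
    return max(range(len(table)), key=scores.__getitem__)


if __name__ == "__main__":
    # Five tokens with one-hot embeddings, purely for demonstration.
    vocab = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
    print([sample_token(vocab, c) for c in (3, 0, 4)])
```

The point of the sketch is structural: the state stays a continuous vector for every step of the loop, guidance is a one-line combination of two velocity predictions, and token ids appear only in the last two lines.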
Problem

Research questions and friction points this paper is trying to address.

diffusion language models
continuous embedding space
discrete tokens
language modeling
generative modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Embedded Language Flows
continuous diffusion models
Flow Matching
language modeling
classifier-free guidance