DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking

📅 2026-03-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Reliable perplexity evaluation of masked diffusion models has been hindered by the absence of exact likelihood computation under test-time distributions. This work proposes DUEL, a framework that unifies prevailing sampling strategies through a deterministic token-selection mechanism and, for the first time, enables exact likelihood computation for masked diffusion models under arbitrary sampling strategies. The approach combines deterministic unmasking, an exact likelihood algorithm, probability-margin sampling, and position-order optimization to substantially improve evaluation accuracy. Experiments demonstrate that DUEL narrows the perplexity gap with autoregressive models by up to 32% on in-domain data and by as much as 82% in zero-shot settings. On AG News, an oracle position ordering yields a substantial advantage, achieving a perplexity of 36.47 compared to 52.11 with conventional ordering.

📝 Abstract
Masked diffusion models (MDMs) generate text by iteratively selecting positions to unmask and then predicting tokens at those positions. Yet MDMs lack proper perplexity evaluation: the ELBO is a loose bound on likelihood under the training distribution, not the test-time distribution, while generative perplexity requires a biased external model and ignores diversity. To address this, we introduce the DUEL framework, which formalizes deterministic position selection, unifying leading MDM sampling strategies. We prove DUEL admits exact likelihood computation via a simple algorithm, evaluated under the same position selection used at test time. This gives MDMs proper perplexity for the first time -- the natural analogue of autoregressive perplexity. With proper perplexity in hand, we revisit key questions about MDMs. MDMs are substantially better than previously thought: the MDM-autoregressive perplexity gap shrinks by up to 32% on in-domain data and 82% on zero-shot benchmarks. DUEL enables the first principled comparison of fast, parallel samplers across compute budgets -- an analysis impossible with the ELBO and unreliable with generative perplexity -- identifying probability margin (Kim et al., 2025) as a strong default. Finally, oracle search over position orderings reveals MDMs can far surpass autoregressive models -- achieving 36.47 vs. 52.11 perplexity on AG News -- demonstrating the ceiling of MDM performance has not yet been reached.
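The abstract's core claim, that fixing a deterministic position-selection rule makes the likelihood exactly computable by the chain rule, can be sketched in a few lines. Note this is a minimal illustration, not the paper's algorithm: the `toy_denoiser` below and its probability values are hypothetical stand-ins for a trained MDM, and the margin-based selection rule is one example of a deterministic unmasking policy.

```python
import math

def toy_denoiser(seq, target, vocab_size=4):
    """Hypothetical stand-in for an MDM denoiser. For each masked
    position (None), returns a distribution over the vocabulary that
    puts slightly more mass on the true token as context grows.
    The numbers are purely illustrative."""
    dists = {}
    n_known = sum(tok is not None for tok in seq)
    for i, tok in enumerate(seq):
        if tok is None:
            p_true = 0.5 + 0.1 * n_known / len(seq)
            dist = [(1.0 - p_true) / (vocab_size - 1)] * vocab_size
            dist[target[i]] = p_true
            dists[i] = dist
    return dists

def margin(dist):
    """Probability margin: gap between the top two token probabilities."""
    top2 = sorted(dist, reverse=True)[:2]
    return top2[0] - top2[1]

def exact_log_likelihood(target, denoiser):
    """Exact log-likelihood under deterministic unmasking: at each step,
    a fixed rule (here, largest probability margin, ties broken by index)
    selects one masked position, and the chain rule accumulates the
    model's log-probability of the true token at that position."""
    seq = [None] * len(target)
    logp = 0.0
    while any(tok is None for tok in seq):
        dists = denoiser(seq, target)
        pos = max(dists, key=lambda i: (margin(dists[i]), -i))
        logp += math.log(dists[pos][target[pos]])
        seq[pos] = target[pos]
    return logp

lp = exact_log_likelihood([0, 1, 2], toy_denoiser)
print(f"log-likelihood: {lp:.4f}, perplexity: {math.exp(-lp / 3):.4f}")
```

Because the selection rule is deterministic, the unmasking order contributes no probability mass of its own, so the sum of per-step log-probabilities is the exact sequence log-likelihood under the test-time sampler, directly comparable to autoregressive perplexity.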
Problem

Research questions and friction points this paper is trying to address.

Masked Diffusion Models
perplexity
likelihood evaluation
ELBO
text generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked Diffusion Models
Exact Likelihood
Deterministic Unmasking
Perplexity Evaluation
DUEL Framework
Gilad Turok
Department of Computer Science, Cornell University, New York, New York, USA
Chris De Sa
Department of Computer Science, Cornell University, New York, New York, USA
Volodymyr Kuleshov
Cornell Tech
Machine Learning · Artificial Intelligence · Computational Biology