🤖 AI Summary
This work addresses a critical problem in Masked Diffusion Models (MDMs): the order in which tokens are unmasked during inference significantly affects generation quality. To tackle this, we propose P2, a framework that reformulates the choice of unmasking order as a learnable, dynamic path-planning task. A lightweight planner—built on BERT or on the denoiser's own features—iteratively selects which token positions to unmask. Theoretically, we derive an extended Evidence Lower Bound (ELBO) that unifies and characterizes existing MDM sampling strategies. Methodologically, P2 integrates autoregressive and non-autoregressive mechanisms, enabling task-adaptive planning. Extensive experiments demonstrate that P2 substantially improves both generation quality and sampling efficiency across diverse domains: language generation (e.g., in-context learning, code synthesis, mathematical reasoning) and biomolecular sequence design (e.g., protein and RNA generation).
📝 Abstract
In this paper, we investigate how the order in which tokens are unmasked during masked diffusion model (MDM) inference affects generative quality. We derive an expanded evidence lower bound (ELBO) that introduces a planner responsible for selecting which tokens to unmask at each step. Our analysis suggests that alternative unmasking strategies can improve generative performance. Based on these insights, we propose Path Planning (P2), a sampling framework that leverages a pre-trained BERT model or the denoiser itself to guide unmasking decisions. P2 generalizes all known MDM sampling strategies and enables significant improvements across diverse domains, including language generation (in-context learning, code generation, story infilling, mathematical reasoning, reversal curse correction) and biological sequence generation (protein and RNA sequences).
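To make the sampling loop concrete, below is a minimal sketch of planner-guided unmasking in the spirit of P2. Everything here is illustrative: `toy_denoiser` is a hypothetical stand-in for a trained MDM denoiser, and the planner shown is a simple confidence-based scorer (one instance of the family of strategies the extended ELBO covers), not the paper's trained planner.

```python
import numpy as np

MASK = -1  # sentinel token id for masked positions

def toy_denoiser(seq, vocab_size, rng):
    """Hypothetical denoiser stand-in: returns a probability
    distribution over the vocabulary at every position."""
    logits = rng.standard_normal((len(seq), vocab_size))
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return probs / probs.sum(axis=-1, keepdims=True)

def planner_scores(probs, masked):
    """Confidence-based planner: score each masked position by the
    denoiser's top probability; exclude already-unmasked positions."""
    scores = probs.max(axis=-1)
    scores[~masked] = -np.inf
    return scores

def p2_style_sample(length, vocab_size, tokens_per_step=2, seed=0):
    """Iteratively unmask the positions the planner ranks highest."""
    rng = np.random.default_rng(seed)
    seq = np.full(length, MASK)
    while (seq == MASK).any():
        probs = toy_denoiser(seq, vocab_size, rng)
        masked = seq == MASK
        scores = planner_scores(probs, masked)
        k = min(tokens_per_step, int(masked.sum()))
        chosen = np.argsort(scores)[-k:]          # planner picks k positions
        seq[chosen] = probs[chosen].argmax(axis=-1)  # commit those tokens
    return seq
```

Swapping `planner_scores` for a different ranking rule (e.g., random order, or left-to-right order) recovers other known MDM sampling strategies, which is the sense in which a planner generalizes them.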