🤖 AI Summary
This work addresses a critical problem in Masked Diffusion Models (MDMs): the order in which tokens are unmasked during inference significantly affects generation quality. To tackle this, we propose P2, a framework that reformulates the choice of unmasking order as a learnable, dynamic path-planning task. A lightweight planner—built on BERT or on the denoiser's own features—iteratively selects which token positions to unmask. Theoretically, we derive an extended Evidence Lower Bound (ELBO) that unifies and characterizes existing MDM sampling strategies. Methodologically, P2 integrates autoregressive and non-autoregressive mechanisms, enabling task-adaptive planning. Extensive experiments demonstrate that P2 substantially improves both generation quality and sampling efficiency across diverse domains: language generation (e.g., in-context learning, code synthesis, mathematical reasoning) and biomolecular sequence design (e.g., protein and RNA generation).
📝 Abstract
In this paper, we investigate how the order in which tokens are unmasked during masked diffusion model (MDM) inference affects generative quality. We derive an expanded evidence lower bound (ELBO) that introduces a planner responsible for selecting which tokens to unmask at each step. Our analysis suggests that alternative unmasking strategies can improve generative performance. Based on these insights, we propose Path Planning (P2), a sampling framework that leverages a pre-trained BERT model or the denoiser itself to guide unmasking decisions. P2 generalizes all known MDM sampling strategies and enables significant improvements across diverse domains, including language generation (in-context learning, code generation, story infilling, mathematical reasoning, reversal curse correction) and biological sequence generation (protein and RNA sequences).
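To make the sampling loop concrete, below is a minimal sketch of planner-guided unmasking in the spirit of P2. Everything here is illustrative: `toy_denoiser` is a hypothetical stand-in for a trained MDM denoiser, and the planner shown is a simple confidence-based scorer (one instance of the family of strategies the extended ELBO covers), not the paper's trained planner.

```python
import numpy as np

MASK = -1  # sentinel token id for masked positions

def toy_denoiser(seq, vocab_size, rng):
    """Hypothetical denoiser stand-in: returns a probability
    distribution over the vocabulary at every position."""
    logits = rng.standard_normal((len(seq), vocab_size))
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return probs / probs.sum(axis=-1, keepdims=True)

def planner_scores(probs, masked):
    """Confidence-based planner: score each masked position by the
    denoiser's top probability; exclude already-unmasked positions."""
    scores = probs.max(axis=-1)
    scores[~masked] = -np.inf
    return scores

def p2_style_sample(length, vocab_size, tokens_per_step=2, seed=0):
    """Iteratively unmask the positions the planner ranks highest."""
    rng = np.random.default_rng(seed)
    seq = np.full(length, MASK)
    while (seq == MASK).any():
        probs = toy_denoiser(seq, vocab_size, rng)
        masked = seq == MASK
        scores = planner_scores(probs, masked)
        k = min(tokens_per_step, int(masked.sum()))
        chosen = np.argsort(scores)[-k:]          # planner picks k positions
        seq[chosen] = probs[chosen].argmax(axis=-1)  # commit those tokens
    return seq
```

Swapping `planner_scores` for a different ranking rule (e.g., random order, or left-to-right order) recovers other known MDM sampling strategies, which is the sense in which a planner generalizes them.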