Guided Star-Shaped Masked Diffusion

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Pretrained masked diffusion models suffer from irreversible decision-making and poor generation quality under low-step sampling. To address this, we propose the star-shaped sampling paradigm: it reformulates sequential generation as a multi-branch parallel re-generation process centered on the initial mask, and introduces a lightweight learnable remasking scheduler to dynamically optimize mask-updating strategies across branches. Our method requires only fine-tuning a single network layer to adapt existing models. Experiments demonstrate substantial improvements in generation quality and stability under ultra-low-step regimes (e.g., 4–8 steps) for both text and code generation tasks. It outperforms or matches state-of-the-art acceleration methods, and—crucially—enables the first efficient, correctable, non-Markovian generation process in masked diffusion models.
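The star-shaped loop described above can be sketched in code. This is a minimal illustration, not the authors' implementation: the model interface, the `MASK` token id, and the confidence-based `remask_scheduler` (standing in for the paper's learnable single-layer scheduler) are all assumptions. The key structural point it shows is that every branch decodes a full draft in parallel and then re-masks suspected errors, rather than freezing tokens irreversibly.

```python
import torch

MASK = 0  # hypothetical mask token id


def remask_scheduler(logits, step, num_steps):
    # Stand-in for the paper's learnable scheduler: keep a growing
    # fraction of the most confident tokens, re-mask the rest.
    probs = logits.softmax(-1)
    conf, _ = probs.max(-1)                      # per-token confidence
    k = int((step + 1) / num_steps * conf.numel())
    thresh = conf.flatten().topk(k).values.min() if k > 0 else float("inf")
    return conf < thresh                         # True = re-mask this token


def star_shaped_sample(model, seq_len, num_steps=4):
    # Every branch starts from a draft and is re-centered on the mask token,
    # so earlier decisions remain correctable (non-Markovian generation).
    x = torch.full((1, seq_len), MASK, dtype=torch.long)
    for step in range(num_steps):
        logits = model(x)            # (1, seq_len, vocab): predict all positions
        x = logits.argmax(-1)        # commit a full parallel draft
        if step < num_steps - 1:
            remask = remask_scheduler(logits, step, num_steps)
            x = torch.where(remask, torch.full_like(x, MASK), x)
    return x
```

In the paper the scheduler is learned (fine-tuning a single layer), so which tokens get re-masked is optimized per branch rather than fixed by a confidence heuristic as above.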

📝 Abstract
The performance of pre-trained masked diffusion models is often constrained by their sampling procedure, which makes decisions irreversible and struggles in low-step generation regimes. We introduce a novel sampling algorithm that works with pre-trained models and, after a lightweight fine-tuning of a single layer, significantly improves sample quality and efficiency. Our method reformulates the generation process using a star-shaped paradigm, which inherently allows for error correction. To make this process effective, we augment it with a learnable re-masking scheduler that intelligently identifies and revises likely errors. This approach yields a substantial quality boost, particularly when using a small number of sampling steps. We extensively ablate key components of our approach and show its usability in different scenarios. In comprehensive experiments on text and code generation, our sampling algorithm outperforms or matches existing methods.
Problem

Research questions and friction points this paper is trying to address.

Improving pre-trained masked diffusion models' sampling efficiency and quality
Enabling error correction through star-shaped generation paradigm
Enhancing low-step generation performance with learnable re-masking scheduler
Innovation

Methods, ideas, or system contributions that make the work stand out.

Star-shaped paradigm enables error correction
Learnable re-masking scheduler revises likely errors
Lightweight fine-tuning boosts quality and efficiency
👥 Authors
Viacheslav Meshchaninov, Constructor University & Higher School of Economics
Egor Shibaev, Constructor University
Artem Makoian, Constructor University
Ivan Klimov, Constructor University
Danil Sheshenya, Higher School of Economics
Andrei Malinin, Higher School of Economics
Nikita Balagansky, Central University (NLP)
Daniil Gavrilov, unknown affiliation (Deep Learning)
Aibek Alanov, Higher School of Economics (Bayesian methods, generative models)
Dmitry P. Vetrov, Constructor University