Unlocking Prompt Infilling Capability for Diffusion Language Models

📅 2026-04-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
While existing masked diffusion language models possess bidirectional denoising capabilities, they struggle to support prompt infilling effectively, limiting their utility in prompt engineering. This work traces the limitation to the training paradigm rather than the model architecture itself and proposes a simple yet effective full-sequence joint masking supervised fine-tuning approach, which masks both prompts and responses simultaneously, to unlock the model's prompt infilling capacity. Without any architectural modifications, the method enables the model to automatically generate high-quality prompt templates from only a few examples. The approach transfers well across diverse downstream tasks and model variants, achieving performance on par with or surpassing manually crafted prompts, while remaining complementary to existing prompt optimization techniques.
📝 Abstract
Masked diffusion language models (dLMs) generate text through bidirectional denoising, yet this capability remains locked for infilling prompts. This limitation is an artifact of the current supervised fine-tuning (SFT) convention of applying response-only masking. To unlock this capability, we apply full-sequence masking during SFT, where both prompts and responses are masked jointly. Once unlocked, the model infills masked portions of a prompt template conditioned on few-shot examples. We show that such model-infilled prompts match or surpass manually designed templates, transfer effectively across models, and are complementary to existing prompt optimization methods. Our results suggest that training practices, not architectural limitations, are the primary bottleneck preventing masked diffusion language models from infilling effective prompts.
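The distinction between the conventional SFT recipe and the paper's full-sequence masking can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the token lists, `mask_id`, and the fixed masking ratio are hypothetical stand-ins (real dLM training typically samples the mask ratio from a noise schedule), and the loss over masked positions is omitted.

```python
import random

def mask_tokens(tokens, mask_id, ratio):
    """Independently replace each token with mask_id at the given ratio."""
    return [mask_id if random.random() < ratio else t for t in tokens]

def build_sft_example(prompt, response, mask_id=-1, ratio=0.5,
                      full_sequence=False):
    """Build one masked-diffusion SFT training example.

    Conventional SFT masks only the response, so prompt tokens are never
    denoising targets. Full-sequence masking (the paper's recipe) masks
    prompt and response jointly, teaching the model to infill prompts too.
    Returns (noisy_input, clean_target); the loss would be computed only
    on masked positions.
    """
    if full_sequence:
        noisy = mask_tokens(prompt + response, mask_id, ratio)
    else:
        noisy = prompt + mask_tokens(response, mask_id, ratio)
    return noisy, prompt + response

# Under response-only masking, the prompt always survives intact,
# which is why infilling prompts stays "locked".
random.seed(0)
noisy, clean = build_sft_example([1, 2, 3], [4, 5, 6], full_sequence=False)
assert noisy[:3] == [1, 2, 3]  # prompt tokens never masked
```

At inference, the unlocked model is given a prompt template whose blanks are mask tokens, conditioned on a few-shot examples in context, and it denoises those positions into a concrete prompt.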
Problem

Research questions and friction points this paper is trying to address.

prompt infilling
diffusion language models
masked language modeling
supervised fine-tuning
prompt engineering
Innovation

Methods, ideas, or system contributions that make the work stand out.

masked diffusion language models
prompt infilling
full-sequence masking
supervised fine-tuning
prompt optimization