🤖 AI Summary
This work addresses a limitation of diffusion language models in infilling tasks: the infill length must be specified in advance, which prevents the model from adapting to the optimal length. The authors identify two phenomena in first-step denoising confidence: a local “Oracle Peak” that emerges near the ground-truth length, and a systematic “Length Bias” that often obscures it. Leveraging these insights, they propose CAL (Calibrated Adaptive Length), a training-free method that efficiently searches for an approximately optimal length before formal decoding by analyzing confidence and calibrating the bias. CAL is the first approach to achieve implicit, training-free length adaptation in diffusion-based infilling. On code infilling benchmarks, it improves Pass@1 by up to 47.7% over fixed-length baselines and by up to 40.5% over chat-based adaptive methods; on text infilling, it raises BLEU-2 and ROUGE-L by up to 8.5% and 9.9%, respectively.
Abstract
Diffusion language models (DLMs) provide a bidirectional generation framework naturally suited for infilling, yet their performance is constrained by the pre-specified infilling length. In this paper, we reveal that DLMs possess an inherent ability to discover the correct infilling length. We identify two key statistical phenomena in the first-step denoising confidence: a local Oracle Peak that emerges near the ground-truth length and a systematic Length Bias that often obscures this signal. By leveraging this signal and calibrating the bias, our training-free method CAL (Calibrated Adaptive Length) enables DLMs to approximate the optimal length through an efficient search before formal decoding. Empirical evaluations demonstrate that CAL improves Pass@1 by up to 47.7% over fixed-length baselines and 40.5% over chat-based adaptive methods in code infilling, while boosting BLEU-2 and ROUGE-L by up to 8.5% and 9.9% in text infilling. These results demonstrate that CAL paves the way for robust DLM infilling without requiring any specialized training. Code is available at https://github.com/NiuHechang/Calibrated_Adaptive_Length.
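The idea of reading a length signal off the first-step confidence and correcting for a length-dependent bias can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how confidence is aggregated or how the bias is calibrated, so the linear detrending below is an assumed stand-in, and `toy_confidence` is a synthetic function used in place of a real DLM's first-step denoising confidence. The names `fit_linear_trend` and `select_length` are hypothetical.

```python
import math
from typing import Callable, Sequence


def fit_linear_trend(xs: Sequence[float], ys: Sequence[float]) -> tuple[float, float]:
    """Least-squares line y = a*x + b (assumed stand-in for the length bias)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


def select_length(confidence_at: Callable[[int], float],
                  candidates: Sequence[int]) -> int:
    """Pick the candidate length whose confidence most exceeds the fitted trend."""
    raw = [confidence_at(length) for length in candidates]
    a, b = fit_linear_trend(candidates, raw)
    # Subtracting the trend removes the systematic bias and exposes local peaks.
    calibrated = [r - (a * length + b) for length, r in zip(candidates, raw)]
    best = max(range(len(candidates)), key=calibrated.__getitem__)
    return candidates[best]


def toy_confidence(length: int) -> float:
    """Synthetic signal (assumption): confidence falls with length,
    with a local bump at a 'ground-truth' length of 12."""
    bias = 0.9 - 0.02 * length
    peak = 0.15 * math.exp(-((length - 12) ** 2) / 2.0)
    return bias + peak


candidates = list(range(2, 25))
naive = max(candidates, key=toy_confidence)          # argmax of raw confidence
chosen = select_length(toy_confidence, candidates)   # argmax after detrending
print(naive, chosen)
```

On this toy signal the raw argmax lands on the shortest candidate because the bias dominates, while the detrended score recovers the bump at length 12. With a real model, `confidence_at` would need one first-step forward pass per candidate length.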