🤖 AI Summary
This work addresses two limitations of existing inference-time scaling methods for diffusion models: they rely on external verifiers or reward models to rank and select samples, so they do not apply where no reliable evaluator exists, and they have not been tailored to sequential generation with region-wise, mixed-noise conditioning. The authors propose Iterative Partial Refinement (IPR), an inference-time scaling method for sequential diffusion models that requires no external verifier. Starting from an already-generated sample, IPR re-noises a subset of regions and regenerates them conditioned on the remaining regions, which improves global consistency. On tasks with strong global constraints, such as MNIST Sudoku, the method raises the valid solution rate from 55.8% to 75.0%, removing the dependence on reward models while improving constraint satisfaction.
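To make the "valid solution rate" metric concrete, here is a minimal sketch of how it could be computed. It assumes each generated board has already been decoded into a 9x9 grid of integers 1 to 9 (for MNIST Sudoku this would need a digit classifier, which is not shown). The function names `is_valid_sudoku` and `valid_solution_rate` are illustrative and are not taken from the IPR codebase.

```python
def is_valid_sudoku(grid):
    """Return True if a 9x9 grid of ints has 1..9 exactly once per row, column and box.

    Assumption: digits are already decoded from images; this is only the rule check.
    """
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    target = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    # Each unit has 9 cells, so equality with {1..9} implies no duplicates.
    return all(unit == target for unit in rows + cols + boxes)


def valid_solution_rate(grids):
    """Fraction of grids that are valid Sudoku solutions."""
    grids = list(grids)
    if not grids:
        raise ValueError("need at least one grid")
    return sum(is_valid_sudoku(g) for g in grids) / len(grids)
```

Under this reading, a rate of 75.0% would mean that three out of every four generated boards pass all 27 row, column and box checks.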
📝 Abstract
Inference-time scaling has emerged as a major approach for improving reasoning capabilities, and has been increasingly applied to diffusion models. However, existing inference-time scaling methods for diffusion models typically rely on external verifiers or reward models to rank and select samples, limiting their scalability to settings where such evaluators are available and reliable. Moreover, while recent diffusion models perform sequential inference with region-wise, mixed-noise conditioning, inference-time scaling tailored to this setting remains relatively underexplored. We propose Iterative Partial Refinement (IPR), an inference-time scaling method for sequential diffusion that requires no external verifier. Starting from an already-generated sample, IPR re-noises a subset of regions and regenerates them conditioned on the remaining regions, enabling the model to revise earlier decisions under a richer context than was available during the initial generation. This iterative partial refinement produces more globally consistent samples without external verification. On reasoning tasks requiring global constraint satisfaction, IPR consistently improves performance: on MNIST Sudoku, the valid solution rate increases from 55.8% to 75.0%. These results show that iterative partial refinement alone can serve as an effective inference-time scaling strategy for diffusion models in sequential, mixed-noise settings. Code is available at: https://github.com/ahn-ml/IPR
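The refinement loop described in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation (see the linked repository for that): regions are selected uniformly at random, selected regions are reset to pure Gaussian noise rather than to an intermediate noise level, and `regenerate` is a hypothetical callable standing in for the diffusion model's conditional denoising of the masked regions. The parameter names `num_rounds` and `renoise_fraction` are likewise invented for illustration.

```python
import numpy as np


def iterative_partial_refinement(sample, regenerate, num_rounds=4,
                                 renoise_fraction=0.25, rng=None):
    """Toy sketch of iterative partial refinement.

    sample:     array of shape (num_regions, ...), an already-generated sample.
    regenerate: hypothetical callable (noisy_sample, mask) -> full sample; it
                should fill the masked regions conditioned on the unmasked ones.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(sample, dtype=float)  # copy, so the caller's sample is untouched
    num_regions = x.shape[0]
    k = max(1, int(round(renoise_fraction * num_regions)))

    for _ in range(num_rounds):
        # Choose which regions to revise this round (assumption: uniform random).
        mask = np.zeros(num_regions, dtype=bool)
        mask[rng.choice(num_regions, size=k, replace=False)] = True

        # Re-noise only the chosen regions; the rest stay clean as context.
        noisy = x.copy()
        noisy[mask] = rng.standard_normal(noisy[mask].shape)

        # Regenerate, then keep the new values only in the re-noised regions.
        proposal = regenerate(noisy, mask)
        x[mask] = proposal[mask]
    return x
```

Note that no verifier or reward model appears anywhere in the loop: each round simply lets the model redo part of the sample with the rest of the sample visible as context, which is the property the abstract emphasizes.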