Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

📅 2025-03-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Pretrained flow models generate samples deterministically, so the particle-sampling methods that make inference-time scaling efficient for diffusion models cannot be applied to them directly. This paper proposes an inference-time scaling framework for flow models with three components: (1) SDE-based generation, which introduces stochasticity and so enables particle sampling; (2) interpolant conversion, which broadens the search space and increases sample diversity; and (3) Rollover Budget Forcing (RBF), which adaptively allocates the computational budget across timesteps. In the reported experiments, SDE-based generation, particularly with a variance-preserving (VP) interpolant, improves particle sampling for flow models, and RBF combined with VP-SDE performs best among the inference-time scaling approaches compared.

📝 Abstract

We propose an inference-time scaling approach for pretrained flow models. Recently, inference-time scaling has gained significant attention in LLMs and diffusion models, improving sample quality or better aligning outputs with user preferences by leveraging additional computation. For diffusion models, particle sampling has allowed more efficient scaling due to the stochasticity at intermediate denoising steps. In contrast, flow models have gained popularity as an alternative to diffusion models, offering faster generation and high-quality outputs in state-of-the-art image and video generative models, but the efficient inference-time scaling methods used for diffusion models cannot be directly applied to them because their generative process is deterministic. To enable efficient inference-time scaling for flow models, we propose three key ideas: 1) SDE-based generation, enabling particle sampling in flow models, 2) Interpolant conversion, broadening the search space and enhancing sample diversity, and 3) Rollover Budget Forcing (RBF), an adaptive allocation of computational resources across timesteps to maximize budget utilization. Our experiments show that SDE-based generation, particularly variance-preserving (VP) interpolant-based generation, improves the performance of particle sampling methods for inference-time scaling in flow models. Additionally, we demonstrate that RBF with VP-SDE achieves the best performance, outperforming all previous inference-time scaling approaches.
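To illustrate why a deterministic process blocks particle sampling, here is a minimal sketch, not the paper's implementation. It compares one deterministic Euler step with one Euler-Maruyama step on a hypothetical 1D toy field. The names `ode_step`, `sde_step`, `propose`, and `toy_field`, the noise scale, and the reward are all assumptions made for illustration. The sketch also reuses the same field as the SDE drift for simplicity; a real ODE-to-SDE conversion would adjust the drift with a score-dependent term so that the marginal distributions are preserved.

```python
import math
import random


def ode_step(x, t, dt, velocity):
    """One deterministic Euler step: the same input always gives the same output."""
    return x + velocity(x, t) * dt


def sde_step(x, t, dt, drift, noise_scale, rng):
    """One Euler-Maruyama step: drift plus Gaussian noise scaled by sqrt(dt)."""
    return x + drift(x, t) * dt + noise_scale * math.sqrt(dt) * rng.gauss(0.0, 1.0)


def propose(x, t, dt, step, num_particles):
    """Draw several candidate next states from the same current state."""
    return [step(x, t, dt) for _ in range(num_particles)]


def toy_field(x, t):
    """Hypothetical field pulling samples toward 1.0; not a trained model."""
    return 1.0 - x


rng = random.Random(0)

# Deterministic proposals: every particle lands on the same point.
ode_candidates = propose(
    0.0, 0.0, 0.1, lambda x, t, dt: ode_step(x, t, dt, toy_field), 4
)

# Stochastic proposals: particles differ, so a reward can choose among them.
sde_candidates = propose(
    0.0, 0.0, 0.1, lambda x, t, dt: sde_step(x, t, dt, toy_field, 0.5, rng), 4
)

# Hypothetical reward: closeness to a target value of 1.0.
best = max(sde_candidates, key=lambda c: -abs(c - 1.0))
```

With the deterministic step, all four candidates coincide, so spending more compute at this step cannot improve the sample. With the stochastic step, the candidates differ and the reward has something to select from.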

Problem

Research questions and friction points this paper is trying to address.

Enable efficient inference-time scaling for flow models

Apply particle sampling to deterministic flow generation

Optimize computational budget allocation across timesteps

Innovation

Methods, ideas, or system contributions that make the work stand out.

SDE-based generation enables particle sampling

Interpolant conversion enhances sample diversity

Rollover Budget Forcing adaptively allocates compute across timesteps to maximize budget utilization
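The budget-allocation idea can be sketched as follows. The abstract describes RBF only as an adaptive allocation of compute across timesteps, so the specific rule here is an assumption for illustration: stop sampling at a step as soon as a candidate beats the current reward, then carry the unused evaluations over to the next step. The function `rollover_search`, its parameters, and the toy proposal and reward are all hypothetical, and the sketch assumes `quota_per_step >= 1`.

```python
import random


def rollover_search(x, num_steps, quota_per_step, propose, reward):
    """Toy search where unused per-step evaluations carry over to the next step."""
    carry = 0
    used_per_step = []
    for step in range(num_steps):
        budget = quota_per_step + carry  # this step's quota plus leftovers
        current_reward = reward(x)
        best, best_reward, used = None, float("-inf"), 0
        while used < budget:
            candidate = propose(x, step)
            candidate_reward = reward(candidate)
            used += 1
            if candidate_reward > best_reward:
                best, best_reward = candidate, candidate_reward
            if candidate_reward > current_reward:
                break  # improvement found: stop early and save the rest
        x = best  # move on with the best candidate seen at this step
        carry = budget - used  # unused evaluations roll over
        used_per_step.append(used)
    return x, used_per_step


rng = random.Random(0)
target = 3.0  # hypothetical reward target

final, used = rollover_search(
    0.0,
    num_steps=5,
    quota_per_step=4,
    propose=lambda x, step: x + rng.gauss(0.0, 1.0),
    reward=lambda x: -abs(x - target),
)
```

Under this rule the total number of evaluations never exceeds `num_steps * quota_per_step`. Steps that find an improvement quickly spend little, and harder steps can draw on the evaluations saved earlier.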
