Test-time scaling of diffusions with flow maps

📅 2025-11-27
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Diffusion models face an ill-posed problem when optimizing user-specified rewards at test time: rewards are typically defined only on final generations, so their gradients are not directly available at intermediate steps of sampling. To address this, we propose Flow Map Trajectory Tilting (FMTT), a principled algorithm that explicitly models the dynamic evolution of the generative process, i.e., the flow map and its associated velocity field, to backpropagate terminal reward gradients along the entire sampling trajectory. This enables trajectory-level tilting and importance-weighted sampling, and its search mode accommodates complex, black-box rewards (e.g., vision-language model scores) without global smoothness assumptions, supporting precise, controllable image editing. Experiments demonstrate that FMTT significantly improves reward scores across diverse look-ahead strategies while preserving high-fidelity output and edit consistency.

๐Ÿ“ Abstract
A common recipe to improve diffusion models at test time so that samples score highly against a user-specified reward is to introduce the gradient of the reward into the dynamics of the diffusion itself. This procedure is often ill-posed, as user-specified rewards are usually only well defined on the data distribution at the end of generation. While a common workaround is to use a denoiser to estimate what a sample would have been at the end of generation, we propose a simple solution to this problem by working directly with a flow map. By exploiting a relationship between the flow map and the velocity field governing the instantaneous transport, we construct an algorithm, Flow Map Trajectory Tilting (FMTT), which provably performs better ascent on the reward than standard test-time methods involving the gradient of the reward. The approach can be used either to perform exact sampling via importance weighting or to run a principled search that identifies local maximizers of the reward-tilted distribution. We demonstrate the efficacy of our approach against other look-ahead techniques, and show how the flow map enables engagement with complicated reward functions that make possible new forms of image editing, e.g., by interfacing with vision-language models.
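The core look-ahead idea in the abstract, pulling the terminal reward gradient back through a map from the current noisy state to the end of generation, can be sketched in a toy setting. Everything below is an illustrative assumption, not the paper's implementation: `flow_map` is a made-up linear stand-in for a learned flow map, and the reward is a simple quadratic so the chain-rule gradient can be checked by finite differences.

```python
import numpy as np

def flow_map(x_t, t, a=0.8, b=0.3):
    """Hypothetical linear flow map sending the noisy state x_t to an endpoint x_1.
    A learned flow map would play this role; linearity is only for illustration."""
    return a * x_t + b

def reward(x1, target):
    """Terminal reward, defined only on final samples: negative squared error."""
    return -np.sum((x1 - target) ** 2)

def lookahead_guidance(x_t, t, target, a=0.8):
    """Gradient of the terminal reward pulled back through the flow map.
    For the linear map above the Jacobian is a*I, so the chain rule gives
    J^T @ grad_r evaluated at the mapped endpoint."""
    x1 = flow_map(x_t, t, a=a)
    grad_r_at_x1 = -2.0 * (x1 - target)   # d reward / d x1
    return a * grad_r_at_x1               # J^T @ grad_r, with J = a*I here

# Finite-difference check that the pulled-back gradient is correct.
x_t = np.array([0.5, -1.0])
target = np.array([1.0, 1.0])
g = lookahead_guidance(x_t, 0.5, target)
eps = 1e-6
fd = np.array([
    (reward(flow_map(x_t + eps * e, 0.5), target)
     - reward(flow_map(x_t - eps * e, 0.5), target)) / (2 * eps)
    for e in np.eye(2)
])
print(np.allclose(g, fd, atol=1e-5))  # True: chain rule matches finite differences
```

In an actual sampler this guidance vector would be added to the drift at each step, which is the sense in which the reward gradient is propagated along the entire trajectory rather than applied only at the final step.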
Problem

Research questions and friction points this paper is trying to address.

Reward guidance is ill-posed: user-specified rewards are defined only on final samples, not on intermediate noisy states
Denoiser-based workarounds only estimate what a sample would have been at the end of generation
Complex rewards such as vision-language model scores are hard to optimize at test time
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flow Map Trajectory Tilting (FMTT) algorithm for test-time scaling
Uses the flow map directly, rather than a denoiser estimate, to compute the reward gradient
Supports exact sampling via importance weighting and principled search over the reward-tilted distribution
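The exact-sampling mode mentioned above can be illustrated with self-normalized importance weighting toward a reward-tilted distribution p(x) proportional to q(x)·exp(lam·r(x)). The base sampler and reward below are toy stand-ins chosen so the tilted mean has a known closed form (4/3); the paper's method operates on full diffusion trajectories, not i.i.d. scalar draws.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(x):
    """Toy reward peaked at x = 2; stand-in for a user-specified score."""
    return -(x - 2.0) ** 2

samples = rng.normal(0.0, 1.0, size=10_000)   # draws from a base model q = N(0, 1)
lam = 1.0
log_w = lam * reward(samples)                 # unnormalized log tilting weights
w = np.exp(log_w - log_w.max())               # subtract max for numerical stability
w /= w.sum()                                  # self-normalize

tilted_mean = np.sum(w * samples)             # estimate of E_p[x]; exact value is 4/3
print(tilted_mean)
```

Because the weights are attached to samples the base model already produces, this reweighting is exact in expectation, in contrast to the approximate guidance obtained from denoiser-based look-ahead.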