🤖 AI Summary
This work addresses the challenge of efficiently adapting diffusion models to reward-tilted distributions when reward gradients are unavailable. We propose Iterative Reward Tilting (IRT), a gradient-free fine-tuning method that decomposes a large reward tilt into a sequence of small, incremental adjustments. Each iteration relies solely on forward evaluations of the reward function and a first-order Taylor expansion to update the score function, eliminating the need for backpropagation or derivative computation along sampling trajectories. IRT is, to our knowledge, the first iterative tilting framework that combines closed-form verification, provably stable convergence, and zero gradient backpropagation; it converges exactly to the theoretical optimum in a 2D Gaussian-mixture setting with a linear reward. Empirical results demonstrate substantial improvements in fine-tuning efficiency and numerical stability over existing gradient-free approaches.
📝 Abstract
We introduce iterative tilting, a gradient-free method for fine-tuning diffusion models toward reward-tilted distributions. The method decomposes a large reward tilt $\exp(\lambda r)$ into $N$ sequential smaller tilts, each admitting a tractable score update via first-order Taylor expansion. This requires only forward evaluations of the reward function and avoids backpropagating through sampling chains. We validate on a two-dimensional Gaussian mixture with linear reward, where the exact tilted distribution is available in closed form.
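To make the closed-form validation setting concrete, the sketch below (an illustration under our own assumptions, not the authors' code) uses the standard fact that tilting a Gaussian mixture $\sum_k \pi_k \mathcal{N}(\mu_k, \Sigma_k)$ by $\exp(\varepsilon\, w^\top x)$ yields another Gaussian mixture with means shifted to $\mu_k + \varepsilon \Sigma_k w$, unchanged covariances, and component weights rescaled by $\exp(\varepsilon w^\top \mu_k + \tfrac{1}{2}\varepsilon^2 w^\top \Sigma_k w)$. It then checks that $N$ sequential small tilts of size $\lambda/N$ compose to the same distribution as one tilt of size $\lambda$; the helper name `tilt_once` and the specific mixture parameters are hypothetical.

```python
# Sketch: for a Gaussian mixture and a linear reward r(x) = w^T x, the
# exp(lambda * r)-tilted distribution is available in closed form, so the
# decomposition of one large tilt into N small sequential tilts can be
# verified exactly. All names and parameters here are illustrative.
import numpy as np

def tilt_once(pis, mus, Sigmas, w, eps):
    """Apply one exponential tilt exp(eps * w^T x) to a Gaussian mixture.

    Each component N(mu, Sigma) becomes N(mu + eps*Sigma@w, Sigma); its
    weight is scaled by exp(eps*w@mu + 0.5*eps^2 * w@Sigma@w), then the
    weights are renormalized.
    """
    log_scale = np.array([
        eps * (w @ mu) + 0.5 * eps**2 * (w @ S @ w)
        for mu, S in zip(mus, Sigmas)
    ])
    new_pis = pis * np.exp(log_scale)
    new_pis /= new_pis.sum()
    new_mus = [mu + eps * (S @ w) for mu, S in zip(mus, Sigmas)]
    return new_pis, new_mus, Sigmas  # covariances are unchanged

# A 2D two-component mixture, linear reward direction w, tilt strength lam.
pis = np.array([0.6, 0.4])
mus = [np.array([-2.0, 0.0]), np.array([2.0, 1.0])]
Sigmas = [np.eye(2), 0.5 * np.eye(2)]
w, lam, N = np.array([1.0, -0.5]), 2.0, 50

# One-shot tilt by lam vs. N sequential tilts of size lam/N each.
d_pis, d_mus, _ = tilt_once(pis, mus, Sigmas, w, lam)
p, m, S = pis, mus, Sigmas
for _ in range(N):
    p, m, S = tilt_once(p, m, S, w, lam / N)

print(np.allclose(d_pis, p))
print(all(np.allclose(a, b) for a, b in zip(d_mus, m)))
```

For a linear reward the composition is exact (up to floating point): the mean shifts accumulate to $\lambda \Sigma_k w$ and the per-step weight factors multiply out to the one-shot factor, which is why this setting admits closed-form verification of the iterative decomposition.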