A unified perspective on fine-tuning and sampling with diffusion and flow models

📅 2026-04-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses sampling and fine-tuning with diffusion and flow models when the target is an exponential tilting of a base density, and proposes a unified framework that connects the stochastic optimal control and nonequilibrium thermodynamics perspectives. The framework covers adjoint-based and score matching methods, including a novel score matching loss, and provides bias-variance decompositions showing that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not. It also establishes norm bounds on the lean adjoint ODE, adapts the CMCD and NETS losses to the exponential tilting setting, and derives Crooks and Jarzynski identities for that setting. Reward fine-tuning experiments on Stable Diffusion 1.5 and 3 are used to validate the analysis.
📝 Abstract
We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of a base density, a formulation that subsumes both sampling from unnormalized densities and reward fine-tuning of pre-trained models. This problem can be approached from a stochastic optimal control (SOC) perspective, using adjoint-based or score matching methods, or from a non-equilibrium thermodynamics perspective. We provide a unified framework encompassing these approaches and make three main contributions: (i) bias-variance decompositions revealing that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not; (ii) norm bounds on the lean adjoint ODE that theoretically support the effectiveness of adjoint-based methods; and (iii) adaptations of the CMCD and NETS loss functions, along with novel Crooks and Jarzynski identities, to the exponential tilting setting. We validate our analysis with reward fine-tuning experiments on Stable Diffusion 1.5 and 3.
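To make "exponential tilting of a base density" concrete, here is a minimal sketch, assuming the common convention p*(x) ∝ p_base(x) · exp(r(x)); the abstract does not spell out the paper's exact notation or any scaling of the reward, so that convention is an assumption. The example is a 1-D toy using self-normalized importance sampling, not any of the paper's methods. The function name `tilted_mean_and_log_z` and the linear reward r(x) = λx are hypothetical choices for illustration. For a standard normal base and this reward, the tilted target is N(λ, 1) and log Z = λ²/2, which gives a closed form to compare against.

```python
import numpy as np


def tilted_mean_and_log_z(base_samples, reward):
    """Estimate the mean of p*(x) ∝ p_base(x) * exp(reward(x)) and log Z,
    where Z = E_base[exp(reward(x))], from samples of the base density.

    Hypothetical helper for illustration only (self-normalized importance
    sampling); it is not a method taken from the paper.
    """
    r = reward(base_samples)
    r_max = r.max()                        # subtract the max for numerical stability
    w = np.exp(r - r_max)                  # unnormalized importance weights
    log_z = r_max + np.log(w.mean())       # log of the Monte Carlo estimate of Z
    w_norm = w / w.sum()                   # self-normalized weights
    tilted_mean = float(np.sum(w_norm * base_samples))
    return tilted_mean, float(log_z)


rng = np.random.default_rng(0)
lam = 0.5                                  # assumed reward slope, r(x) = lam * x
x = rng.standard_normal(200_000)           # samples from the base density N(0, 1)

mean_est, log_z_est = tilted_mean_and_log_z(x, lambda s: lam * s)

# Closed form for this toy case: tilted mean = lam, log Z = lam**2 / 2.
print(f"tilted mean estimate: {mean_est:.3f} (exact {lam})")
print(f"log Z estimate:       {log_z_est:.3f} (exact {lam**2 / 2})")
```

Reweighting base samples like this only works when the base and tilted densities overlap well, as in this toy case. The abstract instead concerns training diffusion and flow models to sample from the tilted target directly.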
Problem

Research questions and friction points this paper is trying to address.

diffusion models
flow models
exponential tilting
reward fine-tuning
sampling
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion models
flow models
stochastic optimal control
exponential tilting
score matching