Fine-tuning of diffusion models via stochastic control: entropy regularization and beyond

📅 2024-03-10
🏛️ arXiv.org
📈 Citations: 19
Influential: 5
📄 PDF
🤖 AI Summary
To address the prevalent reward collapse problem in diffusion model fine-tuning, this paper proposes an entropy-regularized stochastic control framework and— for the first time—rigorously extends it to general *f*-divergence regularization. Methodologically, we formulate a continuous-time stochastic control model, integrating Itô calculus with variational inference to derive a computationally tractable and provably convergent optimal control policy. Theoretically, we establish that the proposed regularization effectively mitigates reward collapse; empirically, it significantly improves both sample quality and diversity. Key contributions include: (1) the first rigorous stochastic control analysis framework specifically designed for diffusion model fine-tuning; (2) a unified generalization of entropy regularization to arbitrary *f*-divergences, substantially enhancing methodological generality and robustness; and (3) a practical fine-tuning paradigm implementable under multiple divergence metrics.

Technology Category

Application Category

📝 Abstract
This paper aims to develop and provide a rigorous treatment to the problem of entropy regularized fine-tuning in the context of continuous-time diffusion models, which was recently proposed by Uehara et al. (arXiv:2402.15194, 2024). The idea is to use stochastic control for sample generation, where the entropy regularizer is introduced to mitigate reward collapse. We also show how the analysis can be extended to fine-tuning involving a general $f$-divergence regularizer.
Problem

Research questions and friction points this paper is trying to address.

Developing rigorous entropy-regularized fine-tuning for diffusion models
Using stochastic control to prevent reward collapse during generation
Extending analysis to general f-divergence regularizers for fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuning diffusion models using stochastic control
Introducing entropy regularization to prevent reward collapse
Extending framework to general f-divergence regularizers
🔎 Similar Papers
No similar papers found.