Amortizing intractable inference in diffusion models for vision, language, and control

📅 2024-05-31
🏛️ arXiv.org
📈 Citations: 14
Influential: 0
🤖 AI Summary
Diffusion models pose a fundamental challenge for unsupervised posterior inference—e.g., zero-shot compositional generation and offline RL policy optimization—when the prior is a diffusion process and the constraint is a black box, rendering conventional posterior sampling intractable. To address this, the paper proposes the Relative Trajectory Balance (RTB) objective, which provably yields asymptotically correct, data-free learning of the constrained posterior, enabling unbiased, scalable sampling from the true posterior under arbitrary black-box constraints. The method connects generative flow networks with deep reinforcement learning and supports both discrete and continuous diffusion dynamics, including score-based behavior priors. Evaluated on classifier-guided generation, discrete language infilling, text-to-image synthesis, and offline RL, RTB achieves state-of-the-art performance, improving both posterior sampling accuracy and mode coverage.

📝 Abstract
Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data, $\mathbf{x} \sim p^{\mathrm{post}}(\mathbf{x}) \propto p(\mathbf{x})\,r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or likelihood function $r(\mathbf{x})$. We state and prove the asymptotic correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior, a problem that existing methods solve only approximately or in restricted cases. Relative trajectory balance arises from the generative flow network perspective on diffusion models, which allows the use of deep reinforcement learning techniques to improve mode coverage. Experiments illustrate the broad potential of unbiased inference of arbitrary posteriors under diffusion priors: in vision (classifier guidance), language (infilling under a discrete diffusion LLM), and multimodal data (text-to-image generation). Beyond generative modeling, we apply relative trajectory balance to the problem of continuous control with a score-based behavior prior, achieving state-of-the-art results on benchmarks in offline reinforcement learning.
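The relative trajectory balance objective described above can be sketched in math. This is a reconstruction from the abstract's description, not the paper's exact notation: for a denoising trajectory $\tau = (x_T, \dots, x_0)$, RTB trains a posterior diffusion sampler $p^{\mathrm{post}}_\theta$ and a learned normalizing constant $Z_\theta$ to satisfy a trajectory-level balance condition against the frozen prior $p$ and the black-box reward $r$:

```latex
% Balance condition: for every denoising trajectory \tau = (x_T, \dots, x_0),
Z_\theta \prod_{t=1}^{T} p^{\mathrm{post}}_\theta(x_{t-1} \mid x_t)
  \;=\; r(x_0) \prod_{t=1}^{T} p(x_{t-1} \mid x_t),
% enforced by minimizing the squared log-ratio over sampled trajectories:
\mathcal{L}_{\mathrm{RTB}}(\tau; \theta)
  \;=\; \left(
    \log \frac{Z_\theta \prod_{t=1}^{T} p^{\mathrm{post}}_\theta(x_{t-1} \mid x_t)}
              {r(x_0) \prod_{t=1}^{T} p(x_{t-1} \mid x_t)}
  \right)^{\!2}.
```

If the balance condition holds for all trajectories, marginalizing over trajectories ending at $x_0$ gives $p^{\mathrm{post}}_\theta(x_0) \propto p(x_0)\,r(x_0)$, which is exactly the target posterior from the abstract; the loss requires no posterior data, only trajectories and evaluations of $r$.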
Problem

Research questions and friction points this paper is trying to address.

Diffusion Models
Reinforcement Learning
Predictive Modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Relative Trajectory Balance
Diffusion Models
Reinforcement Learning
👥 Authors
Siddarth Venkatraman
Mila, University of Montreal
Artificial Intelligence, Robotics
Moksh Jain
Mila, Université de Montréal
probabilistic machine learning
Luca Scimeca
Postdoctoral Research Fellow at Mila AI Institute
Deep Learning, Computer Vision, Probabilistic Inference, Scientific Discovery, Robotics
Minsu Kim
Mila – Québec AI Institute, Université de Montréal, KAIST
Marcin Sendera
PhD Student, Jagiellonian University; Research Intern at Mila – Québec AI Institute
Deep learning, meta-learning, few-shot learning, generative models, normalizing flows
Mohsin Hasan
Mila – Québec AI Institute, Université de Montréal
Luke Rowe
Ph.D. Student, Mila / University of Montreal
Autonomous Driving, Machine Learning, Computer Vision
Sarthak Mittal
Mila, Université de Montréal
Generative Models, Probabilistic Models, Bayesian Inference, Machine Learning
Pablo Lemos
Mila – Université de Montréal
Cosmology, Machine Learning, Bayesian Inference, Generative Models
Emmanuel Bengio
McGill University; Recursion/Valence Labs
Machine Learning, Deep Learning, Reinforcement Learning
Alexandre Adam
University of Montreal
Astrophysics, Cosmology, Deep Learning
Jarrid Rector-Brooks
Université de Montréal, Mila, Caltech
Generative Modeling, Machine Learning, Computer Science
Yoshua Bengio
Professor of Computer Science, University of Montreal, Mila, IVADO, CIFAR
Machine Learning, Deep Learning, Artificial Intelligence
Glen Berseth
Assistant Professor, Université de Montréal
Reinforcement Learning, Robotics, Deep Learning, Machine Learning
Nikolay Malkin
University of Edinburgh