Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

📅 2025-02-10
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the inefficiency of conditional posterior inference, i.e., sampling from $p(\mathbf{x} \mid \mathbf{y})$, in generative models. The authors propose a diffusion-sampling paradigm that operates in the noise latent space $\mathbf{z}$ rather than the data space $\mathbf{x}$. The core innovation is "outsourcing": shifting the hard posterior inference problem into the smoother Gaussian noise space and training a diffusion sampler there with reinforcement learning (policy-gradient methods such as PPO/TRPO), so that pushing its samples through the fixed reparameterization map $f_\theta(\mathbf{z})$ yields the desired posterior in data space. The resulting framework is architecture-agnostic (compatible with pretrained GANs, VAEs, and normalizing flows) and plug-and-play, requiring no modification to existing unconditional generators, and it enables end-to-end amortized inference without task-specific retraining. Experiments on conditional image generation, reinforcement learning with human feedback, and protein structure generation demonstrate substantial improvements over both amortized and non-amortized baselines.
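
To make the "outsourcing" idea concrete: the data-space posterior pulls back to a noise-space target $p(\mathbf{z} \mid \mathbf{y}) \propto \mathcal{N}(\mathbf{z}; 0, I)\, r(f_\theta(\mathbf{z}), \mathbf{y})$, which is the distribution the learned diffusion sampler must match. The sketch below is not the paper's algorithm; it only illustrates that target by self-normalized importance sampling from the Gaussian prior. The linear generator `f_theta`, the Gaussian constraint `r`, and `sigma` are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained generator f_theta: z -> x (a fixed linear map here;
# in the paper this would be a frozen GAN/VAE/flow decoder).
A = rng.normal(size=(2, 2))

def f_theta(z):
    return z @ A.T

# Constraint r(x, y): a soft Gaussian observation model (hypothetical choice).
def r(x, y, sigma=0.5):
    return np.exp(-np.sum((x - y) ** 2, axis=-1) / (2 * sigma**2))

# Noise-space target: p(z | y) is proportional to N(z; 0, I) * r(f_theta(z), y).
# Self-normalized importance sampling with the prior as proposal approximates it.
y = np.array([1.0, -0.5])
z = rng.normal(size=(10_000, 2))           # proposal = Gaussian prior over z
w = r(f_theta(z), y)                       # unnormalized importance weights
w /= w.sum()
idx = rng.choice(len(z), size=1_000, p=w)  # resample noise by weight
x_post = f_theta(z[idx])                   # push through f_theta: posterior x samples
print("posterior mean estimate:", x_post.mean(axis=0))
```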

📝 Abstract
Any well-behaved generative model over a variable $\mathbf{x}$ can be expressed as a deterministic transformation of an exogenous ('outsourced') Gaussian noise variable $\mathbf{z}$: $\mathbf{x} = f_\theta(\mathbf{z})$. In such a model (e.g., a VAE, GAN, or continuous-time flow-based model), sampling of the target variable $\mathbf{x} \sim p_\theta(\mathbf{x})$ is straightforward, but sampling from a posterior distribution of the form $p(\mathbf{x} \mid \mathbf{y}) \propto p_\theta(\mathbf{x})\, r(\mathbf{x}, \mathbf{y})$, where $r$ is a constraint function depending on an auxiliary variable $\mathbf{y}$, is generally intractable. We propose to amortize the cost of sampling from such posterior distributions with diffusion models that sample a distribution in the noise space ($\mathbf{z}$). These diffusion samplers are trained by reinforcement learning algorithms to enforce that the transformed samples $f_\theta(\mathbf{z})$ are distributed according to the posterior in the data space ($\mathbf{x}$). For many models and constraints of interest, the posterior in the noise space is smoother than the posterior in the data space, making it more amenable to such amortized inference. Our method enables conditional sampling under unconditional GAN, (H)VAE, and flow-based priors, comparing favorably both with current amortized and non-amortized inference methods. We demonstrate the proposed outsourced diffusion sampling in several experiments with large pretrained prior models: conditional image generation, reinforcement learning with human feedback, and protein structure generation.
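
As a minimal sketch of the objective being amortized: the paper trains multi-step diffusion samplers in $\mathbf{z}$-space with reinforcement learning; the stand-in below instead fits a one-step diagonal-Gaussian sampler $q_\phi(\mathbf{z})$ by reparameterized KL minimization against the noise-space posterior. The toy linear `f_theta`, the Gaussian `log_r`, and all hyperparameters are illustrative assumptions, not the paper's setup.

```python
import torch

torch.manual_seed(0)

# Frozen pretrained generator f_theta (a fixed linear map as a toy stand-in).
A = torch.randn(2, 2)

def f_theta(z):
    return z @ A.T

# log r(x, y): soft Gaussian constraint (hypothetical; any differentiable log-reward works).
def log_r(x, y, sigma=0.5):
    return -((x - y) ** 2).sum(-1) / (2 * sigma**2)

# Amortized noise-space sampler q_phi(z): a one-step diagonal Gaussian.
mu = torch.zeros(2, requires_grad=True)
log_std = torch.zeros(2, requires_grad=True)
opt = torch.optim.Adam([mu, log_std], lr=1e-2)

y = torch.tensor([1.0, -0.5])
for step in range(2000):
    eps = torch.randn(256, 2)
    z = mu + eps * log_std.exp()  # reparameterized samples from q_phi
    log_q = torch.distributions.Normal(mu, log_std.exp()).log_prob(z).sum(-1)
    log_p = torch.distributions.Normal(0.0, 1.0).log_prob(z).sum(-1)
    # KL(q_phi(z) || p(z | y)) up to a constant: E_q[log q - log p(z) - log r(f_theta(z), y)]
    loss = (log_q - log_p - log_r(f_theta(z), y)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sampling x | y: draw z from the fitted sampler and push it through the frozen f_theta.
with torch.no_grad():
    z = mu + torch.randn(1_000, 2) * log_std.exp()
    x_post = f_theta(z)
print("posterior mean estimate:", x_post.mean(0))
```

Replacing the one-step Gaussian with a multi-step diffusion sampler trained by policy-gradient RL recovers the paper's setting; the claimed smoothness of the noise-space posterior is what makes that amortization tractable.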
Problem

Research questions and friction points this paper is trying to address.

Efficient posterior sampling in generative models
Amortized inference using diffusion models
Conditional sampling with GANs, VAEs, and flow-based models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion models in noise space
Reinforcement learning for sampling
Amortized posterior inference
👥 Authors
S. Venkatraman
Mila – Québec AI Institute, Université de Montréal
Mohsin Hasan
Mila – Québec AI Institute, Université de Montréal
Minsu Kim
Mila – Québec AI Institute, Université de Montréal, KAIST
Luca Scimeca
Postdoctoral Research Fellow at Mila AI Institute
Deep Learning, Computer Vision, Probabilistic Inference, Scientific Discovery, Robotics
Marcin Sendera
PhD Student, Jagiellonian University; Research Intern at Mila – Québec AI Institute
deep learning, meta-learning, few-shot learning, generative models, normalizing flows
Y. Bengio
Mila – Québec AI Institute, Université de Montréal, CIFAR
Glen Berseth
Assistant Professor, Université de Montréal
Reinforcement Learning, Robotics, Deep Learning, Machine Learning
Nikolay Malkin
University of Edinburgh