🤖 AI Summary
Existing density modeling approaches suffer from high training costs, slow inference, approximate likelihood evaluation, mode collapse, or architectural constraints (e.g., enforced bijectivity). This paper proposes a flexible density estimation framework based on learned latent variable marginalization: by introducing a trainable latent distribution and performing Monte Carlo integration for marginalization, the method supports arbitrary neural architectures, exact likelihood computation, and efficient forward/backward sampling. It is the first to integrate latent variable modeling with variational marginalization—thereby circumventing manifold assumptions and invertibility requirements—enabling effective modeling of multimodal distributions and low-dimensional manifolds. Experiments on synthetic data, image latent spaces, positive-definite matrix distributions, and simulation-based inference tasks demonstrate speedups of several orders of magnitude in both training and inference over state-of-the-art methods, with no mode collapse.
📝 Abstract
Current density modeling approaches suffer from at least one of the following shortcomings: expensive training, slow inference, approximate likelihood, mode collapse, or architectural constraints like bijective mappings. We propose a simple yet powerful framework that overcomes these limitations altogether. We define our model $q_\theta(x)$ through a parametric distribution $q(x|w)$ with latent parameters $w$. Instead of directly optimizing the latent variables $w$, our idea is to marginalize them out by sampling $w$ from a learnable distribution $q_\theta(w)$, hence the name Marginal Flow. In order to evaluate the learned density $q_\theta(x)$ or to sample from it, we only need to draw samples from $q_\theta(w)$, which makes both operations efficient. The proposed model allows for exact density evaluation and is orders of magnitude faster than competing models both at training and inference. Furthermore, Marginal Flow is a flexible framework: it does not impose any restrictions on the neural network architecture, it enables learning distributions on lower-dimensional manifolds (either known or to be learned), it can be trained efficiently with any objective (e.g., forward and reverse KL divergence), and it easily handles multi-modal targets. We evaluate Marginal Flow extensively on various tasks including synthetic datasets, simulation-based inference, distributions on positive definite matrices and manifold learning in latent spaces of images.
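The abstract's core mechanism is the marginalization $q_\theta(x) = \mathbb{E}_{w \sim q_\theta(w)}[\,q(x|w)\,]$, estimated by Monte Carlo: draw $w_k \sim q_\theta(w)$ and average $q(x|w_k)$; forward sampling draws $w \sim q_\theta(w)$ and then $x \sim q(x|w)$. The sketch below illustrates this with toy choices not taken from the paper: $q_\theta(w)$ is a 1-D Gaussian with hypothetical parameters `mu`, `tau` (in the paper it would be a learnable, e.g. neural, distribution), and $q(x|w) = \mathcal{N}(x; w, \sigma^2)$ with a fixed bandwidth `sigma`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (stand-ins for the learnable q_theta(w) of the paper):
mu, tau = 0.0, 2.0   # q_theta(w) = N(mu, tau^2)
sigma = 0.5          # conditional q(x|w) = N(x; w, sigma^2)

def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def density(x, n_samples=10_000):
    """Monte Carlo estimate of q_theta(x) = E_{w ~ q_theta(w)}[ q(x|w) ]."""
    w = rng.normal(mu, tau, size=n_samples)  # w_k ~ q_theta(w)
    return gaussian_pdf(x, w, sigma).mean()  # average of q(x | w_k)

def sample(n):
    """Forward sampling: w ~ q_theta(w), then x ~ q(x|w)."""
    w = rng.normal(mu, tau, size=n)
    return rng.normal(w, sigma)
```

With these Gaussian choices the marginal is available in closed form, $\mathcal{N}(\mu, \tau^2 + \sigma^2)$, so the Monte Carlo estimate can be checked against it; for a general $q_\theta(w)$ no closed form exists, which is exactly where the sampling-based evaluation pays off.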