Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data

📅 2024-06-06
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Modeling the joint distribution of high-dimensional discrete variables (e.g., binary or categorical data) suffers from computational complexity that grows exponentially with the number of categories, severely limiting scalability. Method: This paper proposes a generative model based on randomized assignment flows, which represents discrete distributions via measure transport on a statistical submanifold. It integrates *e*-connection geodesics from information geometry with conditional Riemannian flow matching, enabling simulation-free, end-to-end training. Contribution/Results: The model's complexity is linear in the number of affinity-function parameters—bypassing the exponential barrier of conventional methods—while supporting efficient sampling and exact log-likelihood evaluation. Empirical validation on structured image annotation shows that performance degrades markedly more slowly than state-of-the-art baselines as the category count grows, confirming the method's practical utility for large-scale discrete modeling.
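The *e*-connection geodesics mentioned above have a simple closed form on the open probability simplex: linear interpolation in natural (log) parameters, followed by normalization. A minimal NumPy sketch of this — not the authors' implementation; `e_geodesic` and the example distributions are illustrative:

```python
import numpy as np

def e_geodesic(p, q, t):
    """Closed-form e-connection geodesic between two points p, q of the
    open probability simplex: linear interpolation of the natural
    (exponential-family) parameters, i.e. geometric interpolation of
    the distributions themselves, followed by normalization."""
    log_pt = (1.0 - t) * np.log(p) + t * np.log(q)
    pt = np.exp(log_pt - log_pt.max())  # subtract max for numerical stability
    return pt / pt.sum()

# hypothetical endpoint distributions over 4 categories
p = np.array([0.70, 0.10, 0.10, 0.10])
q = np.array([0.05, 0.05, 0.20, 0.70])

midpoint = e_geodesic(p, q, 0.5)
```

Because the geodesic is available in closed form, training data can be encoded as such curves without simulating the flow.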

📝 Abstract
We introduce a novel generative model for the representation of joint probability distributions of a possibly large number of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions, which makes it possible to represent and sample efficiently from any target distribution and to assess the likelihood of unseen data points. The complexity of the target distribution only depends on the parametrization of the affinity function of the dynamical assignment flow system. Our model can be trained in a simulation-free manner by conditional Riemannian flow matching, using the training data encoded as geodesics on the assignment manifold in closed form, with respect to the e-connection of information geometry. Numerical experiments devoted to distributions of structured image labelings demonstrate the applicability to large-scale problems, which may include discrete distributions in other application areas. Performance measures show that our approach scales better with an increasing number of classes than recent related work.
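The simulation-free training the abstract describes amounts to regressing a learned vector field onto the closed-form velocity of the e-geodesic at a randomly sampled time. A hedged sketch of that regression target, assuming NumPy; the function names, the zero-field baseline, and the example distributions are illustrative, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def e_geodesic(p, q, t):
    # e-connection geodesic: geometric interpolation, then normalization
    log_pt = (1.0 - t) * np.log(p) + t * np.log(q)
    pt = np.exp(log_pt - log_pt.max())
    return pt / pt.sum()

def e_geodesic_velocity(p, q, t):
    # closed-form time derivative of the geodesic: a replicator-style
    # tangent vector p_t * (u - E_{p_t}[u]) with u = log q - log p
    pt = e_geodesic(p, q, t)
    u = np.log(q) - np.log(p)
    return pt * (u - pt @ u)

def flow_matching_loss(predict_field, p0, p1):
    """Simulation-free conditional flow matching step: sample a time,
    evaluate the geodesic point and its velocity in closed form, and
    penalize the squared error of the learned field at that point."""
    t = rng.uniform()
    pt = e_geodesic(p0, p1, t)
    target = e_geodesic_velocity(p0, p1, t)
    return np.sum((predict_field(pt, t) - target) ** 2)

# hypothetical source (barycenter) and target (near-vertex) distributions
p0 = np.array([0.25, 0.25, 0.25, 0.25])
p1 = np.array([0.85, 0.05, 0.05, 0.05])
loss = flow_matching_loss(lambda x, t: np.zeros_like(x), p0, p1)
```

In practice `predict_field` would be the parametrized affinity function trained by gradient descent on this loss; no numerical integration of the flow is needed during training.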
Problem

Research questions and friction points this paper is trying to address.

Probability Representation
Discrete Distributions
Efficiency and Accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic Processing
Riemannian Flow Matching
Discrete Distribution Learning