🤖 AI Summary
This work addresses the challenges of modeling complex dependencies and mitigating redundancy in high-dimensional parameter spaces for discrete data generation. It introduces, for the first time, a Riemannian geometric structure with isometric properties on the exponential parameter space of product manifolds of categorical distributions, thereby constructing a low-dimensional latent subspace. Under this Riemannian metric, geodesics within the subspace become straight lines, enabling consistent and efficient flow-matching training. The proposed approach substantially reduces the dimensionality of the latent variables while preserving strong representational capacity for discrete data distributions. Experimental results demonstrate that the model achieves accurate and efficient discrete data generation with a significantly lower-dimensional latent space, effectively balancing computational efficiency and modeling performance.
📝 Abstract
We introduce latent subspaces of the exponential parameter space of product manifolds of categorical distributions as a tool for learning generative models of discrete data. The low-dimensional latent space encodes statistical dependencies and removes redundant degrees of freedom among the categorical variables. We equip the parameter domain with a Riemannian geometry under which the spaces and their distances are related by isometries, which enables consistent flow matching. In particular, geodesics become straight lines, which makes model training by flow matching effective. Empirical results demonstrate that reduced latent dimensions suffice to represent data for generative modeling.
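The central geometric idea — that geodesics in the exponential (logit) parameter space become straight lines, yielding a constant flow-matching target velocity — can be illustrated with a minimal numpy sketch. This is a hypothetical toy setup, not the paper's implementation: the shapes `K` (number of categorical variables) and `V` (categories per variable), the function names, and the choice of a flat metric on logits are illustrative assumptions.

```python
import numpy as np

def softmax(z, axis=-1):
    # Map exponential-family (logit) parameters to categorical probabilities.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def geodesic_point(theta0, theta1, t):
    # Under a flat (isometric) metric on the parameter space, the geodesic
    # between two parameter points is a straight line: linear interpolation.
    return (1.0 - t) * theta0 + t * theta1

def target_velocity(theta0, theta1):
    # Along a straight-line geodesic, the conditional flow-matching target
    # velocity is constant in t: simply the endpoint difference.
    return theta1 - theta0

rng = np.random.default_rng(0)
K, V = 4, 8  # hypothetical: 4 categorical variables with 8 categories each

theta0 = rng.normal(size=(K, V))  # e.g. logits of a noise distribution
theta1 = rng.normal(size=(K, V))  # e.g. logits of a data sample

theta_t = geodesic_point(theta0, theta1, 0.3)
u_t = target_velocity(theta0, theta1)

# Every point on the path parameterizes a valid product of categoricals.
p_t = softmax(theta_t)
```

A regression model trained to predict `u_t` from `(theta_t, t)` would then realize the flow-matching objective; because the target is constant along each straight-line path, the regression problem is particularly well conditioned.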