🤖 AI Summary
This paper addresses nonlinear dimensionality reduction for distributional data residing in the space of probability measures. It develops Geodesic Principal Component Analysis (GPCA) grounded in the Otto–Wasserstein geometry. Methodologically, the authors introduce an end-to-end differentiable GPCA framework that couples a Wasserstein geodesic parameterization with trainable neural networks, enabling learning over general absolutely continuous probability measures; for Gaussian distributions, the computations are further lifted to a space of linear maps to improve efficiency. Unlike classical tangent-space PCA, the approach captures the intrinsic nonlinear structure of variation in distributional data. Experiments on synthetic and real-world datasets show gains over tangent PCA in dimensionality reduction accuracy, reconstruction fidelity, and interpretability, positioning the method as a geometric deep learning tool for distributional data modeling.
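To make the Gaussian case concrete, the sketch below computes a point on the closed-form Wasserstein geodesic between two Gaussians, the structure that underlies the lifting to linear maps. The function name and NumPy/SciPy implementation are illustrative assumptions, not the paper's code; the formulas themselves are the standard Bures–Wasserstein expressions.

```python
import numpy as np
from scipy.linalg import sqrtm

def bures_wasserstein_geodesic(m0, S0, m1, S1, t):
    """Point at time t on the W2 geodesic between N(m0, S0) and N(m1, S1).

    Uses the closed-form optimal transport map between Gaussians,
    T(x) = m1 + A (x - m0), with
    A = S0^{-1/2} (S0^{1/2} S1 S0^{1/2})^{1/2} S0^{-1/2}.
    """
    S0_half = np.real(sqrtm(S0))
    S0_half_inv = np.linalg.inv(S0_half)
    A = S0_half_inv @ np.real(sqrtm(S0_half @ S1 @ S0_half)) @ S0_half_inv
    # McCann interpolation: push N(m0, S0) through (1 - t) * Id + t * T.
    mt = (1 - t) * m0 + t * m1
    Ct = (1 - t) * np.eye(len(m0)) + t * A
    St = Ct @ S0 @ Ct.T
    return mt, St

# Example: midpoint of the geodesic between two 2-D Gaussians.
mt, St = bures_wasserstein_geodesic(
    np.zeros(2), np.eye(2), np.ones(2), np.diag([2.0, 0.5]), t=0.5)
```

Because the map between Gaussians is linear, the whole geodesic stays Gaussian, which is what allows the Gaussian GPCA computations to be carried out in a space of linear maps.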
📝 Abstract
This paper focuses on Geodesic Principal Component Analysis (GPCA) on a collection of probability distributions using the Otto–Wasserstein geometry. The goal is to identify geodesic curves in the space of probability measures that best capture the modes of variation of the underlying dataset. We first address the case of a collection of Gaussian distributions, and show how to lift the computations to the space of invertible linear maps. For the more general setting of absolutely continuous probability measures, we leverage a novel approach that parameterizes geodesics in Wasserstein space with neural networks. Finally, we compare with classical tangent PCA through various examples and provide illustrations on real-world datasets.
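As a rough illustration of how a geodesic in Wasserstein space can be parameterized with a neural network, the sketch below uses an input-convex network as a Brenier potential and McCann's displacement interpolation. This is a generic, hedged construction under those assumptions; the class and function names are hypothetical and need not match the authors' architecture or training objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvexPotential(nn.Module):
    """A minimal input-convex neural network (ICNN): softplus activations
    and non-negative weights on the hidden path keep phi convex in x, so
    grad(phi) is the gradient of a convex potential (a Brenier-type map)."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.Wx = nn.Linear(dim, hidden)            # affine in x: convex
        self.Wz = nn.Linear(hidden, hidden, bias=False)
        self.out = nn.Linear(hidden, 1, bias=False)

    def forward(self, x):
        z = F.softplus(self.Wx(x))
        # Clamping enforces non-negative weights, preserving convexity in x.
        z = F.softplus(F.linear(z, self.Wz.weight.clamp(min=0.0)))
        return F.linear(z, self.out.weight.clamp(min=0.0)).squeeze(-1)

def geodesic_sample(phi, x0, t):
    """Samples from the time-t point of the geodesic from the base measure:
    push x0 ~ mu through (1 - t) * Id + t * grad(phi), i.e. McCann's
    displacement interpolation."""
    x0 = x0.detach().requires_grad_(True)
    grad = torch.autograd.grad(phi(x0).sum(), x0, create_graph=True)[0]
    return (1 - t) * x0 + t * grad

# Example: samples along the geodesic at t = 0.5 from a Gaussian base measure.
phi = ConvexPotential(dim=2)
x_half = geodesic_sample(phi, torch.randn(128, 2), t=0.5)
```

Since the interpolation is built from differentiable operations, a reconstruction-type loss on the pushed samples can be backpropagated into the potential's weights, which is what makes an end-to-end differentiable GPCA of this kind possible.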