🤖 AI Summary
This work addresses manifold learning for the space of absolutely continuous probability measures $\mathcal{P}_{\mathrm{a.c.}}(\Omega)$, endowed with the Wasserstein-2 metric, where $\Omega \subset \mathbb{R}^d$ is compact and convex. We propose the first extrinsic, distance-based implicit manifold modeling framework grounded solely in Wasserstein distances. Methodologically, we construct locally linearizable, non-flat Wasserstein submanifolds and estimate tangent spaces via spectral analysis of a covariance operator built from optimal transport maps to nearby samples, while the latent metric structure is recovered from pairwise Wasserstein distances alone. Theoretically, we prove that, as the samples become dense, the distance graph asymptotically recovers the intrinsic metric structure of the manifold. Empirically, tangent spaces are reconstructed with high accuracy. Our key contribution is to move beyond Euclidean assumptions: we establish the first rigorous theoretical and algorithmic foundation for distance-driven manifold learning directly in Wasserstein space.
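To make the spectral step concrete, here is one plausible form of the covariance operator, written in our own notation as a sketch (the paper's exact definition may differ): given a base sample $\lambda$ and optimal transport maps $T_i$ from $\lambda$ to nearby samples $\lambda_i$, set the displacements $v_i = T_i - \mathrm{id} \in L^2(\lambda; \mathbb{R}^d)$ and consider

$$ C_\lambda\, v \;=\; \frac{1}{N} \sum_{i=1}^{N} \langle v_i, v \rangle_{L^2(\lambda)}\, v_i, $$

whose leading eigenfunctions then span the estimated tangent space at $\lambda$.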
📝 Abstract
This paper aims at building the theoretical foundations for manifold learning algorithms in the space of absolutely continuous probability measures $\mathcal{P}_{\mathrm{a.c.}}(\Omega)$, with $\Omega$ a compact and convex subset of $\mathbb{R}^d$, metrized with the Wasserstein-2 distance $\mathbb{W}$. We begin by introducing a construction of submanifolds $\Lambda$ in $\mathcal{P}_{\mathrm{a.c.}}(\Omega)$ equipped with the metric $\mathbb{W}_\Lambda$, the geodesic restriction of $\mathbb{W}$ to $\Lambda$. In contrast to other constructions, these submanifolds are not necessarily flat, but still allow for local linearizations in a similar fashion to Riemannian submanifolds of $\mathbb{R}^d$. We then show how the latent manifold structure of $(\Lambda, \mathbb{W}_\Lambda)$ can be learned from samples $\{\lambda_i\}_{i=1}^N$ of $\Lambda$ and pairwise extrinsic Wasserstein distances $\mathbb{W}$ on $\mathcal{P}_{\mathrm{a.c.}}(\Omega)$ only. In particular, we show that the metric space $(\Lambda, \mathbb{W}_\Lambda)$ can be asymptotically recovered in the sense of Gromov–Wasserstein from a graph with nodes $\{\lambda_i\}_{i=1}^N$ and edge weights $\mathbb{W}(\lambda_i, \lambda_j)$. In addition, we demonstrate how the tangent space at a sample $\lambda$ can be asymptotically recovered via spectral analysis of a suitable "covariance operator" using optimal transport maps from $\lambda$ to sufficiently close and diverse samples $\{\lambda_i\}_{i=1}^N$. The paper closes with some explicit constructions of submanifolds $\Lambda$ and numerical examples on the recovery of tangent spaces through spectral analysis.
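As an illustration of the tangent-space recovery step, the sketch below discretizes the measures, approximates the optimal transport maps $T_i$ by barycentric projection of exact couplings, and extracts principal displacement directions by SVD. It assumes the POT library (`ot`); the function `estimate_tangent_directions` and all of its parameters are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np
import ot  # POT: Python Optimal Transport


def estimate_tangent_directions(base_pts, base_w, samples, n_components=2):
    """Estimate principal tangent directions at a discretized base measure
    lambda = sum_k base_w[k] * delta_{base_pts[k]} from nearby samples.

    base_pts : (n, d) support points of lambda
    base_w   : (n,) probability weights of lambda (summing to 1)
    samples  : list of (pts_i, w_i) discretizations of nearby measures
    """
    n, d = base_pts.shape
    disps = []
    for pts_i, w_i in samples:
        # Squared-Euclidean cost matrix and exact optimal coupling.
        M = ot.dist(base_pts, pts_i)          # 'sqeuclidean' by default
        G = ot.emd(base_w, w_i, M)
        # Barycentric projection of the coupling approximates the Monge map T_i.
        T = (G @ pts_i) / base_w[:, None]
        disps.append((T - base_pts).ravel())  # displacement field T_i - id
    V = np.stack(disps)                       # (N, n*d)
    V -= V.mean(axis=0)                       # center for the covariance
    # Weight coordinates by sqrt(lambda) so Euclidean inner products of rows
    # approximate the L^2(lambda) inner product of displacement fields.
    w_sqrt = np.sqrt(np.repeat(base_w, d))
    _, svals, Vt = np.linalg.svd(V * w_sqrt, full_matrices=False)
    # Top right-singular vectors = leading eigenfunctions of the empirical
    # covariance operator; un-weight to return plain displacement fields.
    basis = (Vt[:n_components] / w_sqrt).reshape(n_components, n, d)
    eigvals = svals[:n_components] ** 2 / len(samples)
    return basis, eigvals
```

The $\sqrt{\lambda}$ reweighting is the key design choice: it makes the SVD of the flattened displacement matrix coincide with the spectral analysis of the empirical covariance operator in $L^2(\lambda)$, mirroring the construction sketched after the summary above.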