🤖 AI Summary
This work addresses manifold learning for the space of absolutely continuous probability measures $\mathcal{P}_{\mathrm{a.c.}}(\Omega)$, endowed with the Wasserstein-2 metric, where $\Omega \subset \mathbb{R}^d$ is compact and convex. We propose the first extrinsic, distance-based implicit manifold modeling framework grounded solely in Wasserstein distances. Methodologically, we construct locally linearizable, non-flat Wasserstein submanifolds and estimate tangent spaces via spectral analysis of a covariance operator built from optimal transport maps to nearby samples, while the latent metric structure is recovered from pairwise Wasserstein distances alone. Theoretically, we prove that, as the samples become dense, the distance graph asymptotically recovers the intrinsic metric structure of the manifold. Empirically, tangent spaces are reconstructed with high accuracy. Our key contribution is to move beyond Euclidean assumptions: we establish the first rigorous theoretical and algorithmic foundation for distance-driven manifold learning directly in Wasserstein space.
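To make the spectral step concrete, here is one plausible form of the covariance operator, written in our own notation as a sketch (the paper's exact definition may differ): given a base sample $\lambda$ and optimal transport maps $T_i$ from $\lambda$ to nearby samples $\lambda_i$, set the displacements $v_i = T_i - \mathrm{id} \in L^2(\lambda; \mathbb{R}^d)$ and consider

$$ C_\lambda\, v \;=\; \frac{1}{N} \sum_{i=1}^{N} \langle v_i, v \rangle_{L^2(\lambda)}\, v_i, $$

whose leading eigenfunctions then span the estimated tangent space at $\lambda$.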
📝 Abstract
This paper aims at building the theoretical foundations for manifold learning algorithms in the space of absolutely continuous probability measures $\mathcal{P}_{\mathrm{a.c.}}(\Omega)$, with $\Omega$ a compact and convex subset of $\mathbb{R}^d$, metrized with the Wasserstein-2 distance $\mathbb{W}$. We begin by introducing a construction of submanifolds $\Lambda$ in $\mathcal{P}_{\mathrm{a.c.}}(\Omega)$ equipped with the metric $\mathbb{W}_\Lambda$, the geodesic restriction of $\mathbb{W}$ to $\Lambda$. In contrast to other constructions, these submanifolds are not necessarily flat, but still allow for local linearizations in a similar fashion to Riemannian submanifolds of $\mathbb{R}^d$. We then show how the latent manifold structure of $(\Lambda, \mathbb{W}_\Lambda)$ can be learned from samples $\{\lambda_i\}_{i=1}^N$ of $\Lambda$ and pairwise extrinsic Wasserstein distances $\mathbb{W}$ on $\mathcal{P}_{\mathrm{a.c.}}(\Omega)$ only. In particular, we show that the metric space $(\Lambda, \mathbb{W}_\Lambda)$ can be asymptotically recovered in the sense of Gromov–Wasserstein from a graph with nodes $\{\lambda_i\}_{i=1}^N$ and edge weights $\mathbb{W}(\lambda_i, \lambda_j)$. In addition, we demonstrate how the tangent space at a sample $\lambda$ can be asymptotically recovered via spectral analysis of a suitable "covariance operator" using optimal transport maps from $\lambda$ to sufficiently close and diverse samples $\{\lambda_i\}_{i=1}^N$. The paper closes with some explicit constructions of submanifolds $\Lambda$ and numerical examples on the recovery of tangent spaces through spectral analysis.
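As an illustration of the tangent-space recovery step, the sketch below discretizes the measures, approximates the optimal transport maps $T_i$ by barycentric projection of exact couplings, and extracts principal displacement directions by SVD. It assumes the POT library (`ot`); the function `estimate_tangent_directions` and all of its parameters are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np
import ot  # POT: Python Optimal Transport


def estimate_tangent_directions(base_pts, base_w, samples, n_components=2):
    """Estimate principal tangent directions at a discretized base measure
    lambda = sum_k base_w[k] * delta_{base_pts[k]} from nearby samples.

    base_pts : (n, d) support points of lambda
    base_w   : (n,) probability weights of lambda (summing to 1)
    samples  : list of (pts_i, w_i) discretizations of nearby measures
    """
    n, d = base_pts.shape
    disps = []
    for pts_i, w_i in samples:
        # Squared-Euclidean cost matrix and exact optimal coupling.
        M = ot.dist(base_pts, pts_i)          # 'sqeuclidean' by default
        G = ot.emd(base_w, w_i, M)
        # Barycentric projection of the coupling approximates the Monge map T_i.
        T = (G @ pts_i) / base_w[:, None]
        disps.append((T - base_pts).ravel())  # displacement field T_i - id
    V = np.stack(disps)                       # (N, n*d)
    V -= V.mean(axis=0)                       # center for the covariance
    # Weight coordinates by sqrt(lambda) so Euclidean inner products of rows
    # approximate the L^2(lambda) inner product of displacement fields.
    w_sqrt = np.sqrt(np.repeat(base_w, d))
    _, svals, Vt = np.linalg.svd(V * w_sqrt, full_matrices=False)
    # Top right-singular vectors = leading eigenfunctions of the empirical
    # covariance operator; un-weight to return plain displacement fields.
    basis = (Vt[:n_components] / w_sqrt).reshape(n_components, n, d)
    eigvals = svals[:n_components] ** 2 / len(samples)
    return basis, eigvals
```

The $\sqrt{\lambda}$ reweighting is the key design choice: it makes the SVD of the flattened displacement matrix coincide with the spectral analysis of the empirical covariance operator in $L^2(\lambda)$, mirroring the construction sketched after the summary above.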