🤖 AI Summary
To address the challenge of preserving complex structural relationships in high-dimensional data reduction, this paper proposes a novel probabilistic measure embedding method based on the Gromov–Wasserstein (GW) distance. It models both the original high-dimensional data and its low-dimensional representation as probability measures, quantifies their structural discrepancy via the GW distance, and optimizes the embedding through gradient descent. This work is the first to systematically integrate the GW distance into dimensionality reduction, unifying multidimensional scaling (MDS) and Isomap under a probabilistic measure-theoretic framework. By replacing conventional Euclidean or geodesic distances, it eliminates reliance on predefined metric structures and significantly improves fidelity for multiscale and nonlinear manifold structures. Experiments demonstrate that the method yields more robust and interpretable low-dimensional embeddings on complex datasets, achieving superior structural preservation and cross-domain generalization compared to classical MDS and Isomap.
📝 Abstract
Analyzing relationships between objects is a pivotal problem within data science. In this context, dimensionality reduction (DR) techniques are employed to generate smaller and more manageable data representations. This paper proposes a new method for dimensionality reduction based on optimal transportation theory and the Gromov–Wasserstein distance. We offer a new probabilistic view of the classical multidimensional scaling (MDS) algorithm and of Isomap (isometric mapping, or isometric feature mapping), the nonlinear DR algorithm that extends classical MDS, in which we use the Gromov–Wasserstein distance between the probability measure of the high-dimensional data and that of its low-dimensional representation. Through gradient descent, our method embeds high-dimensional data into a lower-dimensional space, providing a robust and efficient solution for analyzing complex high-dimensional datasets.
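To illustrate the gradient-descent embedding the abstract describes, here is a minimal NumPy sketch. It is not the authors' implementation: it fixes the GW coupling to the identity (uniform weights, known point correspondences), under which the GW objective between the two distance structures reduces to the classical MDS stress, sum over (i, j) of (D_X[i,j] - D_Y[i,j])^2, which is then minimized over the low-dimensional coordinates Y by plain gradient descent. The function name `gw_stress_embed` and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def pairwise_dist(Z):
    """Euclidean distance matrix of the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def gw_stress_embed(X, dim=2, iters=300, lr=0.001, seed=0):
    """Embed X into `dim` dimensions by gradient descent on the
    GW objective with a fixed identity coupling and uniform weights,
    i.e. the MDS stress  sum_{i,j} (D_X[i,j] - D_Y[i,j])**2.
    All names and step sizes here are illustrative, not the paper's."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    DX = pairwise_dist(X)
    Y = rng.normal(scale=1.0, size=(n, dim))  # random low-dim start
    for _ in range(iters):
        DY = pairwise_dist(Y)
        np.fill_diagonal(DY, 1.0)          # avoid 0/0; diagonal residuals are zeroed below
        resid = DY - DX
        np.fill_diagonal(resid, 0.0)
        diff = Y[:, None, :] - Y[None, :, :]
        # d/dY_i of sum_{i,j} (DY_ij - DX_ij)^2  =  4 * sum_j (resid_ij / DY_ij) (Y_i - Y_j)
        grad = 4.0 * (resid / DY)[:, :, None] * diff
        Y -= lr * grad.sum(axis=1)
    return Y

# Usage sketch: points on a circle living in 3-D, embedded into 2-D.
t = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
X = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)], axis=1)
Y = gw_stress_embed(X, dim=2)
```

In the full method the coupling is not fixed but optimized as part of the GW distance, which is what removes the dependence on a shared ambient metric; replacing `DX` with a geodesic (shortest-path) distance matrix recovers the Isomap variant the abstract mentions.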