🤖 AI Summary
To address the high computational cost of Wasserstein distances in high-dimensional manifold learning, this paper proposes sparse completion methods based on the Nyström approximation that reconstruct the full Wasserstein distance matrix from only a small fraction (e.g., 10%) of its columns. Theoretically, the authors prove that multidimensional scaling (MDS) is stable under Nyström completion and show that, for a fixed budget of sampled distances, it can outperform conventional matrix completion. Technically, the paper analyzes both an upper-triangular sampling strategy and Nyström-based completion, and embeds the resulting estimates into the MDS and Isomap frameworks. Experiments on the MedMNIST OrganCMNIST dataset demonstrate that computing merely 10% of the columns preserves classification accuracy while drastically reducing computational overhead. The key contribution is the application of the Nyström paradigm to sparse recovery of Wasserstein distance matrices, combining theoretical guarantees with practical efficiency.
📝 Abstract
This paper proposes two algorithms for estimating square Wasserstein distance matrices from a small number of entries. These matrices are used to compute manifold learning embeddings such as multidimensional scaling (MDS) or Isomap but, unlike Euclidean distance matrices, are extremely costly to compute. We analyze matrix completion from upper-triangular samples and Nyström completion, in which $\mathcal{O}(d \log d)$ columns of the distance matrix are computed, where $d$ is the desired embedding dimension; we prove stability of MDS under Nyström completion and show that it can outperform matrix completion for a fixed budget of sampled distances. Finally, we show that classification of the OrganCMNIST dataset from the MedMNIST benchmark is stable on data embedded from the Nyström estimate of the distance matrix even when only 10% of the columns are computed.
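To make the column-sampling idea concrete, here is a minimal sketch of Nyström completion of a distance matrix followed by classical MDS. This is not the paper's implementation: squared Euclidean distances stand in for Wasserstein distances, and the helper names (`nystrom_complete`, `classical_mds`), the landmark indices `idx`, and the sampling budget `m` are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def nystrom_complete(cols, idx):
    """Nystrom reconstruction C W^+ C^T from m sampled columns.

    cols : (n, m) array, the m computed columns of a symmetric matrix
    idx  : indices of the m sampled (landmark) columns
    """
    C = cols
    W = C[idx, :]                       # (m, m) landmark-landmark block
    return C @ np.linalg.pinv(W) @ C.T

def classical_mds(D2, d):
    """Classical MDS from a squared-distance matrix via double centering."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J               # centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:d]    # d largest eigenpairs
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

# Toy data: squared Euclidean distances stand in for Wasserstein ones.
n, d = 200, 5
X = rng.normal(size=(n, d))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)

m = 40                                  # budget of sampled columns
idx = rng.choice(n, size=m, replace=False)
D2_hat = nystrom_complete(D2[:, idx], idx)   # only these m columns needed

Y = classical_mds(D2_hat, d)            # d-dimensional embedding
rel_err = np.linalg.norm(D2_hat - D2) / np.linalg.norm(D2)
```

In a Wasserstein setting, only the `m` sampled columns would actually require optimal-transport computations; the full `D2` is built here solely to check the reconstruction. Because a squared Euclidean distance matrix of `d`-dimensional points has rank at most `d + 2`, the sketch recovers it essentially exactly once the landmark block captures that rank.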