🤖 AI Summary
This paper addresses the computational cost of estimating Wasserstein distances for many pairs of probability distributions sampled from a meta-distribution. We propose a lightweight regression-based estimator that uses standard sliced Wasserstein (SW) distances as lower bounds and lifted SW distances as upper bounds on the Wasserstein distance. These bounds serve as low-dimensional features for a linear regression model with few trainable parameters, whose unconstrained form is solved in closed form via least squares, enabling fast Wasserstein estimation. The model can be fit from only a small number of distribution pairs, which keeps the training cost low. On multiple benchmark datasets, it approximates the Wasserstein distance better than the state-of-the-art Wasserstein embedding method, Wasserstein Wormhole, particularly in low-data regimes. The estimator can also be used to accelerate Wormhole training, and the resulting method is called RG-Wormhole. The approach thus offers an efficient alternative for large-scale distribution comparison tasks.
📝 Abstract
We address the problem of efficiently computing Wasserstein distances for multiple pairs of distributions drawn from a meta-distribution. To this end, we propose a fast estimation method based on regressing Wasserstein distance on sliced Wasserstein (SW) distances. Specifically, we leverage both standard SW distances, which provide lower bounds, and lifted SW distances, which provide upper bounds, as predictors of the true Wasserstein distance. To ensure parsimony, we introduce two linear models: an unconstrained model with a closed-form least-squares solution, and a constrained model that uses only half as many parameters. We show that accurate models can be learned from a small number of distribution pairs. Once estimated, the model can predict the Wasserstein distance for any pair of distributions via a linear combination of SW distances, making it highly efficient. Empirically, we validate our approach on diverse tasks, including Gaussian mixtures, point-cloud classification, and Wasserstein-space visualizations for 3D point clouds. Across various datasets such as MNIST point clouds, ShapeNetV2, MERFISH Cell Niches, and scRNA-seq, our method consistently provides a better approximation of Wasserstein distance than the state-of-the-art Wasserstein embedding model, Wasserstein Wormhole, particularly in low-data regimes. Finally, we demonstrate that our estimator can also accelerate Wormhole training, yielding RG-Wormhole.
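The regression idea can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: the toy meta-distribution (random Gaussian point clouds), the helper names `exact_w2`, `sw_features`, `sample_cloud`, and `make_pairs`, and the particular upper bound, which reuses the matching from each sorted 1D projection but measures its cost in the original space. The paper's lifted SW distances may be defined differently. Only the unconstrained least-squares model is shown.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def exact_w2(x, y):
    """Exact 2-Wasserstein distance between two equal-size, uniformly weighted point clouds."""
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    rows, cols = linear_sum_assignment(cost)               # optimal one-to-one matching
    return np.sqrt(cost[rows, cols].mean())


def sw_features(x, y, thetas):
    """Return (lower, upper) bounds on W2 computed from shared random projections."""
    px, py = x @ thetas.T, y @ thetas.T                    # projections, shape (n, L)
    ix, iy = np.argsort(px, axis=0), np.argsort(py, axis=0)
    sx = np.take_along_axis(px, ix, axis=0)
    sy = np.take_along_axis(py, iy, axis=0)
    lower_sq = ((sx - sy) ** 2).mean()                     # Monte Carlo SW_2^2
    # Assumed "lifted" variant: same sorted matching, cost measured in the original space.
    upper_sq = ((x[ix] - y[iy]) ** 2).sum(-1).mean()
    return np.sqrt(lower_sq), np.sqrt(upper_sq)


rng = np.random.default_rng(0)
d, n, L = 3, 64, 32                                        # dimension, points per cloud, projections
thetas = rng.normal(size=(L, d))
thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)    # unit-norm projection directions


def sample_cloud():
    """Toy meta-distribution: a Gaussian cloud with random mean and scale."""
    return rng.normal(size=d) + rng.uniform(0.5, 1.5) * rng.normal(size=(n, d))


def make_pairs(m):
    feats, targets = [], []
    for _ in range(m):
        x, y = sample_cloud(), sample_cloud()
        feats.append(sw_features(x, y, thetas))
        targets.append(exact_w2(x, y))
    return np.array(feats), np.array(targets)


# Fit on a small number of pairs: closed-form least squares, no intercept.
F_train, w_train = make_pairs(20)
coef, *_ = np.linalg.lstsq(F_train, w_train, rcond=None)

# Predict on unseen pairs as a linear combination of the two SW features.
F_test, w_test = make_pairs(50)
w_pred = F_test @ coef
mae = np.abs(w_pred - w_test).mean()
print("coefficients:", coef, "mean absolute error:", mae)
```

The exact solver is only called on the 20 training pairs (the 50 test pairs use it here purely to measure error). At prediction time each new pair costs a handful of sorts and one dot product, which is where the speedup over solving an assignment problem per pair would come from.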