🤖 AI Summary
This paper addresses the computational inefficiency and high memory overhead of computing the sliced Wasserstein (SW) distance on streaming data. We propose Stream-SW, the first SW approximation algorithm that can be updated online. Stream-SW combines streaming quantile estimation with the closed-form solution of the 1-D Wasserstein distance, and aggregates the per-projection estimates incrementally along random projection directions to enable real-time SW estimation. We theoretically establish a bounded approximation error and prove that Stream-SW achieves an optimal memory complexity of O(1/ε), substantially lower than conventional sampling-based approaches. Experiments on streaming data drawn from Gaussians and Gaussian mixtures show that Stream-SW approximates SW more accurately than the random subsampling baseline. Stream-SW is further validated on several downstream tasks (point cloud classification, gradient flow modeling, and streaming change-point detection), demonstrating its effectiveness and practical utility in streaming scenarios.
📝 Abstract
Sliced optimal transport (SOT), or the sliced Wasserstein (SW) distance, is widely recognized for its statistical and computational scalability. In this work, we further enhance the computational scalability by proposing the first method for computing SW from sample streams, called streaming sliced Wasserstein (Stream-SW). To define Stream-SW, we first introduce the streaming computation of the one-dimensional Wasserstein (1DW) distance. Since the 1DW distance has a closed-form expression, given by the absolute difference between the quantile functions of the compared distributions integrated over quantile levels, we leverage quantile approximation techniques for sample streams to define the streaming 1DW distance. By applying streaming 1DW to all projections, we obtain Stream-SW. The key advantage of Stream-SW is its low memory complexity, which comes with theoretical guarantees on the approximation error. We demonstrate that Stream-SW achieves a more accurate approximation of SW than random subsampling, with lower memory consumption, when comparing Gaussian distributions and mixtures of Gaussians from streaming samples. Additionally, we conduct experiments on point cloud classification, point cloud gradient flows, and streaming change point detection to further highlight the favorable performance of Stream-SW.
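The construction described in the abstract (project each sample, maintain a quantile summary per projection, then compare quantile functions) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the quantile sketch it relies on, so `CompactorSketch` below is a simplified, hypothetical stand-in built from a hierarchy of fixed-size buffers. The function names, the quantile grid, and all default parameters are likewise assumptions made for illustration.

```python
import numpy as np


class CompactorSketch:
    """Toy streaming quantile sketch: a hierarchy of fixed-size buffers.

    Assumption: a simplified stand-in, not the sketch used in the paper.
    An item stored at level h represents 2**h original samples.
    """

    def __init__(self, capacity=64, seed=0):
        assert capacity % 2 == 0, "capacity must be even so total weight is preserved"
        self.capacity = capacity
        self.levels = [[]]
        self.rng = np.random.default_rng(seed)

    def update(self, value):
        self.levels[0].append(float(value))
        h = 0
        # When a buffer fills up, keep every other sorted item and promote it.
        while len(self.levels[h]) >= self.capacity:
            buf = sorted(self.levels[h])
            offset = int(self.rng.integers(2))  # random parity keeps ranks unbiased
            if h + 1 == len(self.levels):
                self.levels.append([])
            self.levels[h + 1].extend(buf[offset::2])
            self.levels[h] = []
            h += 1

    def quantiles(self, ts):
        """Approximate quantile function evaluated at levels ts in (0, 1)."""
        vals = np.array([v for lvl in self.levels for v in lvl])
        wts = np.array([2.0 ** h for h, lvl in enumerate(self.levels) for _ in lvl])
        order = np.argsort(vals)
        vals, wts = vals[order], wts[order]
        cum = np.cumsum(wts) / wts.sum()
        idx = np.minimum(np.searchsorted(cum, ts, side="left"), len(vals) - 1)
        return vals[idx]


def random_directions(n_proj, dim, seed=0):
    """Directions drawn uniformly on the unit sphere."""
    thetas = np.random.default_rng(seed).normal(size=(n_proj, dim))
    return thetas / np.linalg.norm(thetas, axis=1, keepdims=True)


def stream_sw(stream_x, stream_y, thetas, p=2, capacity=64, grid=200):
    """Sliced Wasserstein-p estimate from two sample streams (one pass each)."""
    sk_x = [CompactorSketch(capacity, seed=2 * i) for i in range(len(thetas))]
    sk_y = [CompactorSketch(capacity, seed=2 * i + 1) for i in range(len(thetas))]
    for stream, sketches in ((stream_x, sk_x), (stream_y, sk_y)):
        for sample in stream:  # each sample is seen once, then discarded
            for sketch, proj in zip(sketches, thetas @ sample):
                sketch.update(proj)
    ts = (np.arange(grid) + 0.5) / grid  # midpoint grid of quantile levels
    w_pp = [np.mean(np.abs(a.quantiles(ts) - b.quantiles(ts)) ** p)
            for a, b in zip(sk_x, sk_y)]
    return float(np.mean(w_pp) ** (1.0 / p))


def batch_sw(X, Y, thetas, p=2, grid=200):
    """Reference value that stores all samples, using the same directions."""
    ts = (np.arange(grid) + 0.5) / grid
    qx = np.quantile(X @ thetas.T, ts, axis=0)
    qy = np.quantile(Y @ thetas.T, ts, axis=0)
    return float(np.mean(np.abs(qx - qy) ** p) ** (1.0 / p))


# Two 2-D Gaussians with identity covariance and means (0, 0) and (3, 0).
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(2000, 2))
Y = data_rng.normal(size=(2000, 2)) + np.array([3.0, 0.0])
thetas = random_directions(50, 2, seed=1)
est = stream_sw(X, Y, thetas)
ref = batch_sw(X, Y, thetas)
print(f"streaming estimate: {est:.3f}, full-sample reference: {ref:.3f}")
```

In this toy version each sketch retains a few hundred values at most for a stream of 2000 samples, whereas the reference computation keeps every sample. The buffer hierarchy grows logarithmically with the stream length, so it does not reproduce the memory bound stated for Stream-SW; it only shows how the quantile-based closed form makes a one-pass estimate possible.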