🤖 AI Summary
This paper addresses the problem of computing a (1+ε)-approximation to the Earth Mover’s Distance (EMD) between high-dimensional point sets under ℓₚ metrics with p ∈ [1, 2]. It gives a reduction from approximate EMD to the approximate Closest Pair (CP) problem: a (1+ε)-approximate CP algorithm running in time n^{2−φ} yields a (1+O(ε))-approximation to EMD in time n^{2−Ω(φ)}. The main technical ingredient is a sublinear implementation of the Multiplicative Weights Update (MWU) framework that exploits geometric structure to perform the updates implicitly, without ever explicitly computing or storing the weights. Plugging in the fastest known CP algorithm [Alman, Chan, Williams FOCS'16] gives a (1+ε)-approximation to EMD in n^{2−Ω̃(ε^{1/3})} time, improving on the previous best running time of n^{2−Ω(ε²)} [Andoni, Zhang FOCS'23].
📝 Abstract
We give a reduction from $(1+\varepsilon)$-approximate Earth Mover's Distance (EMD) to $(1+\varepsilon)$-approximate Closest Pair (CP). As a consequence, we improve the fastest known approximation algorithm for high-dimensional EMD. Here, given $p\in [1, 2]$ and two sets of $n$ points $X,Y \subseteq (\mathbb R^d,\ell_p)$, their EMD is the minimum cost of a perfect matching between $X$ and $Y$, where the cost of matching two vectors is their $\ell_p$ distance. Further, CP is the basic problem of finding a pair of points realizing $\min_{x \in X, y\in Y} \|x-y\|_p$. Our contribution is twofold: we show that if a $(1+\varepsilon)$-approximate CP can be computed in time $n^{2-\phi}$, then a $1+O(\varepsilon)$ approximation to EMD can be computed in time $n^{2-\Omega(\phi)}$; plugging in the fastest known algorithm for CP [Alman, Chan, Williams FOCS'16], we obtain a $(1+\varepsilon)$-approximation algorithm for EMD running in time $n^{2-\tilde{\Omega}(\varepsilon^{1/3})}$ for high-dimensional point sets, which improves over the prior fastest running time of $n^{2-\Omega(\varepsilon^2)}$ [Andoni, Zhang FOCS'23]. Our main technical contribution is a sublinear implementation of the Multiplicative Weights Update framework for EMD. Specifically, we demonstrate that the updates can be executed without ever explicitly computing or storing the weights; instead, we exploit the underlying geometric structure to perform the updates implicitly.
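To make the two problem definitions concrete, the following is a minimal Python sketch of exact, brute-force baselines for CP and EMD on tiny inputs. It is not the paper's algorithm: it only illustrates the quantities being approximated, and the function names (`lp_dist`, `closest_pair`, `emd_bruteforce`) are hypothetical helpers introduced here. The EMD routine enumerates all $n!$ perfect matchings, so it is only usable for very small $n$.

```python
from itertools import permutations


def lp_dist(x, y, p=1.0):
    """l_p distance between two equal-length vectors (p >= 1)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def closest_pair(X, Y, p=1.0):
    """Exact bichromatic closest pair by checking all |X| * |Y| pairs.

    Returns (distance, i, j), where X[i] and Y[j] realize the minimum.
    """
    return min(
        (lp_dist(x, y, p), i, j)
        for i, x in enumerate(X)
        for j, y in enumerate(Y)
    )


def emd_bruteforce(X, Y, p=1.0):
    """Exact EMD as the minimum-cost perfect matching between X and Y.

    Enumerates every permutation, so the cost is O(n! * n): an
    illustration of the definition only, not a practical method.
    """
    n = len(X)
    if n != len(Y):
        raise ValueError("X and Y must contain the same number of points")
    return min(
        sum(lp_dist(X[i], Y[pi[i]], p) for i in range(n))
        for pi in permutations(range(n))
    )


# Illustrative toy data (not from the paper).
X = [(0, 0), (1, 0), (5, 5)]
Y = [(0, 1), (1, 1), (5, 4)]

cp_dist, cp_i, cp_j = closest_pair(X, Y, p=1.0)
emd_value = emd_bruteforce(X, Y, p=1.0)
```

On this toy input each point of `X` has a partner in `Y` at $\ell_1$ distance 1, so the closest-pair distance is 1 and the optimal matching costs 3. A polynomial-time exact baseline would replace the permutation loop with a min-cost bipartite matching solver; the reduction described above targets subquadratic approximation instead.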