🤖 AI Summary
This work studies optimal transport between two distributions μ and ν in high-dimensional Euclidean space ℝⁿ from an algorithmic perspective: given a sample x ∼ μ, construct a nearby sample y ∼ ν in poly(n) time. The core challenge is to make the running time depend only on the dimension n, not on the full representation size of the distributions. We introduce the notion of *sequential samplability* and give a general algorithm that transports any product distribution μ to any sequentially samplable ν at ℓₚᵖ cost Δ + δ, where Δ is the Knothe–Rosenblatt transport cost and δ is a computational error that decreases with running time. For the standard Gaussian Φⁿ, we prove an algorithmic version of Talagrand's inequality under squared Euclidean cost: using only a membership oracle for a set S of Gaussian measure ε, samples from Φⁿ are mapped to Φⁿ conditioned on S in poly(n/ε) time with expected squared distance O(log(1/ε)), which is optimal for general S of measure ε. As a corollary, we obtain the first computational concentration result for Gaussian measure under Euclidean distance with dimension-independent transportation cost, resolving an open question of Etesami et al.
📝 Abstract
We study optimal transport between two high-dimensional distributions $\mu,\nu$ in $\mathbb{R}^n$ from an algorithmic perspective: given $x \sim \mu$, find a close $y \sim \nu$ in $\mathrm{poly}(n)$ time, where $n$ is the dimension of $x,y$. Thus, running time depends on the dimension rather than the full representation size of $\mu,\nu$. Our main result is a general algorithm for transporting any product distribution $\mu$ to any $\nu$ with cost $\Delta + \delta$ under $\ell_p^p$, where $\Delta$ is the Knothe-Rosenblatt transport cost and $\delta$ is a computational error decreasing with runtime. This requires $\nu$ to be "sequentially samplable" with bounded average sampling cost, a new but natural notion.
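The coordinate-wise structure of the Knothe-Rosenblatt transport can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes exact access to the conditional quantile function of each coordinate of $\nu$, whereas a sequential sampler only provides samples from those conditionals. The source is taken to be the standard Gaussian product measure, and the target is a hypothetical AR(1) Gaussian chosen only so that the conditionals have a closed form. The names `ar1_conditional`, `knothe_rosenblatt`, and `lp_cost` are illustrative assumptions.

```python
import math
from statistics import NormalDist

# One-dimensional marginal of the product source distribution (assumed standard Gaussian).
STD_NORMAL = NormalDist()


def ar1_conditional(prefix, rho=0.5):
    """Hypothetical target: law of the next coordinate given the prefix y_1..y_{i-1}.

    The target is a stationary AR(1) Gaussian with correlation rho, picked only
    because its conditionals are known exactly.
    """
    if not prefix:
        return NormalDist(0.0, 1.0)
    return NormalDist(rho * prefix[-1], math.sqrt(1.0 - rho * rho))


def knothe_rosenblatt(x, conditional=ar1_conditional):
    """Map x (a sample from the product source) to y, one coordinate at a time.

    Each x_i is pushed through the source CDF and then through the quantile
    function of the target conditional given the coordinates already produced.
    """
    y = []
    for x_i in x:
        u = STD_NORMAL.cdf(x_i)               # quantile level of x_i under the source
        y.append(conditional(y).inv_cdf(u))   # same quantile level under nu(. | y_<i)
    return y


def lp_cost(x, y, p=2):
    """Transport cost sum_i |x_i - y_i|^p for a single pair (x, y)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y))


if __name__ == "__main__":
    x = [1.0, 0.0, 0.0]
    y = knothe_rosenblatt(x)
    print(y, lp_cost(x, y))
```

Averaging `lp_cost` over many source samples would estimate the cost $\Delta$ for this toy target. In the setting of the abstract, the exact quantile step is replaced by sampling, which is where the additional error $\delta$ enters.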
We further prove:
An algorithmic version of Talagrand's inequality for transporting the standard Gaussian $\Phi^n$ to arbitrary $\nu$ under squared Euclidean cost. For $\nu = \Phi^n$ conditioned on a set $\mathcal{S}$ of measure $\varepsilon$, we construct the sequential sampler in expected time $\mathrm{poly}(n/\varepsilon)$ using membership oracle access to $\mathcal{S}$. This yields an algorithmic transport from $\Phi^n$ to $\Phi^n|\mathcal{S}$ in $\mathrm{poly}(n/\varepsilon)$ time and expected squared distance $O(\log(1/\varepsilon))$, optimal for general $\mathcal{S}$ of measure $\varepsilon$.
As a corollary, we obtain the first computational concentration result (Etesami et al. SODA 2020) for Gaussian measure under Euclidean distance with dimension-independent transportation cost, resolving an open question of Etesami et al. Specifically, for any $\mathcal{S}$ of Gaussian measure $\varepsilon$, most $\Phi^n$ samples can be mapped to $\mathcal{S}$ within distance $O(\sqrt{\log(1/\varepsilon)})$ in $\mathrm{poly}(n/\varepsilon)$ time.
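For orientation, the $O(\log(1/\varepsilon))$ and $O(\sqrt{\log(1/\varepsilon)})$ rates can be traced back to the classical, non-algorithmic form of Talagrand's inequality. The derivation below is a sketch of that existential argument only. The constants come from the classical inequality and are not claimed to match those of the algorithmic result. $W_2$ denotes the 2-Wasserstein distance and $(X, Y)$ an optimal coupling of $\Phi^n$ and $\Phi^n|\mathcal{S}$.

```latex
\begin{align*}
W_2^2(\nu, \Phi^n) &\le 2\, D_{\mathrm{KL}}(\nu \,\|\, \Phi^n)
  && \text{(Talagrand's inequality)} \\
D_{\mathrm{KL}}(\Phi^n|\mathcal{S} \,\|\, \Phi^n)
  &= \int \log\frac{\mathbf{1}_{\mathcal{S}}(y)}{\varepsilon}\, d(\Phi^n|\mathcal{S})(y)
   = \log\frac{1}{\varepsilon}
  && \text{(density } \mathbf{1}_{\mathcal{S}}/\varepsilon\text{)} \\
W_2^2(\Phi^n|\mathcal{S}, \Phi^n) &\le 2\log\frac{1}{\varepsilon} \\
\Pr\big[\|X - Y\|_2 \ge t\big] &\le \frac{2\log(1/\varepsilon)}{t^2}
  && \text{(Markov, for an optimal coupling } (X, Y)\text{)}
\end{align*}
```

Taking $t = \sqrt{2\log(1/\varepsilon)/\eta}$ shows that all but an $\eta$ fraction of Gaussian samples lie within distance $O(\sqrt{\log(1/\varepsilon)})$ of $\mathcal{S}$ for constant $\eta$, with no dependence on $n$. This argument only shows that such a coupling exists. The results stated above concern computing a coupling of this quality in $\mathrm{poly}(n/\varepsilon)$ time.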