🤖 AI Summary
This paper studies the statistical performance of Sinkhorn iterations for estimating Schrödinger bridges in the two-sample setting, where only finite independent samples of sizes $m$ and $n$ from the source and target distributions are available, and analyzes how stopping at an intermediate Sinkhorn iterate affects estimation accuracy. The main contribution is an upper bound of $O(1/m + 1/n + r^{4k})$ on the squared total variation error of the Sinkhorn bridge after $k$ iterations, where $r \in (0,1)$ is the contraction rate. The first two terms capture the sampling error and the third captures the error from stopping Sinkhorn early, so the bound quantifies the trade-off between sample size and iteration count and gives guidance for choosing both. Through a previously unnoticed connection between the estimators, the results also apply, under specific settings, to [SF]$^2$M, DSBM-IMF, BM2, and LightSB(-M).
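As a rough reading of the bound, one can ask how many iterations make the term $r^{4k}$ no larger than the sampling term $1/m + 1/n$. The sketch below solves $r^{4k} \le 1/m + 1/n$ for the smallest integer $k$. It is an illustration only: it ignores the constants hidden in the $O(\cdot)$ notation, it assumes $r$ is known (in practice it usually is not), and the function name `iterations_to_match_sampling_error` is hypothetical rather than taken from the paper.

```python
import math


def iterations_to_match_sampling_error(m: int, n: int, r: float) -> int:
    """Smallest k >= 0 with r**(4*k) <= 1/m + 1/n.

    Assumption: constants hidden in the O(.) bound are ignored,
    so this is a heuristic, not a prescription from the paper.
    """
    if m <= 0 or n <= 0:
        raise ValueError("sample sizes must be positive")
    if not (0.0 < r < 1.0):
        raise ValueError("r must lie in (0, 1)")
    sampling_term = 1.0 / m + 1.0 / n
    if sampling_term >= 1.0:
        # r**0 = 1 already sits below the sampling term.
        return 0
    # r**(4k) <= s  <=>  k >= log(s) / (4 * log(r)), since log(r) < 0.
    return math.ceil(math.log(sampling_term) / (4.0 * math.log(r)))


if __name__ == "__main__":
    for r in (0.5, 0.9, 0.99):
        print(r, iterations_to_match_sampling_error(1000, 1000, r))
```

Because $k$ grows only logarithmically in the sample sizes for fixed $r$, this reading suggests that a moderate number of iterations can suffice unless $r$ is close to 1.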
📝 Abstract
The Schrödinger bridge problem seeks the optimal stochastic process that connects two given probability distributions with minimal energy modification. While the Sinkhorn algorithm is widely used to solve the static optimal transport problem, recent work (Pooladian and Niles-Weed, 2024) proposed the Sinkhorn bridge, which estimates Schrödinger bridges by plugging optimal transport into the time-dependent drifts of SDEs, with statistical guarantees in the one-sample estimation setting where the true source distribution is fully accessible. In this work, to further justify this method, we study the statistical performance of intermediate Sinkhorn iterations in the two-sample estimation setting, where only finite samples from both source and target distributions are available. Specifically, we establish a statistical bound on the squared total variation error of Sinkhorn bridge iterations: $O(1/m + 1/n + r^{4k})$ with $r \in (0,1)$, where $m$ and $n$ are the sample sizes from the source and target distributions, respectively, and $k$ is the number of Sinkhorn iterations. This result provides a theoretical guarantee for the finite-sample performance of the Schrödinger bridge estimator and offers practical guidance for selecting sample sizes and the number of Sinkhorn iterations. Notably, our theoretical results apply to several representative methods such as [SF]$^2$M, DSBM-IMF, BM2, and LightSB(-M) under specific settings, through the previously unnoticed connection between these estimators.
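To make the two-sample setting concrete, the following minimal sketch runs $k$ log-domain Sinkhorn iterations between two empirical samples and returns the intermediate dual potentials. Several choices are assumptions made for illustration and are not specified in the abstract: uniform sample weights, the cost $\tfrac{1}{2}\|x - y\|^2$, the regularization parameter `eps`, and all function names. The construction of the bridge drift from these potentials is not shown.

```python
import numpy as np


def logsumexp(a, axis):
    """Numerically stable log-sum-exp along one axis."""
    a_max = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(a_max, axis=axis) + np.log(
        np.sum(np.exp(a - a_max), axis=axis)
    )


def sinkhorn_potentials(x, y, eps, k):
    """Run k Sinkhorn iterations between samples x (m, d) and y (n, d).

    Assumptions (illustrative): uniform weights, cost 0.5 * ||x - y||^2.
    Returns the dual potentials f (m,), g (n,) and the cost matrix (m, n).
    """
    m, n = x.shape[0], y.shape[0]
    cost = 0.5 * np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    log_a = np.full(m, -np.log(m))  # log of uniform source weights
    log_b = np.full(n, -np.log(n))  # log of uniform target weights
    f = np.zeros(m)
    g = np.zeros(n)
    for _ in range(k):
        # Update f so the row marginals of the coupling match the source.
        f = -eps * logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        # Update g so the column marginals match the target.
        g = -eps * logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    return f, g, cost


def coupling(f, g, cost, eps):
    """Coupling matrix induced by the potentials under uniform weights."""
    m, n = cost.shape
    return np.exp((f[:, None] + g[None, :] - cost) / eps) / (m * n)


def row_marginal_error(x, y, eps, k):
    """L1 distance between the coupling's row marginal and uniform weights."""
    f, g, cost = sinkhorn_potentials(x, y, eps, k)
    p = coupling(f, g, cost, eps)
    return float(np.abs(p.sum(axis=1) - 1.0 / x.shape[0]).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(30, 2))        # m = 30 source samples
    y = rng.normal(size=(40, 2)) + 1.0  # n = 40 target samples
    for k in (1, 5, 50):
        print(k, row_marginal_error(x, y, eps=1.0, k=k))
```

Since each iteration ends with the update of `g`, the column marginals are matched exactly and the remaining row-marginal error serves as a simple proxy for how far an intermediate iterate is from convergence. It is not the total variation error that the bound controls.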