Bandit Optimal Transport

📅 2025-02-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies sequential learning of optimal transport (OT) when the marginal distributions are known but the transport cost is unknown. It formulates both Kantorovich and entropy-regularized OT as stochastic bandit problems, which the authors present as the first treatment of this setting. The approach reduces the problem to infinite-dimensional linear bandits on Hilbert spaces, combining Kantorovich duality theory with RKHS regularization to design an adaptive algorithm, and attains Õ(√T) regret for both problems. To cope with the infinite dimension, the algorithm exploits the intrinsic smoothness of the cost function and tunes the regularization strength without prior knowledge of it. The resulting regret bounds interpolate between Õ(√T) and Õ(T) according to the smoothness of the cost.

📝 Abstract

Despite the impressive progress in statistical Optimal Transport (OT) in recent years, there has been little interest in the study of the sequential learning of OT. Surprisingly so, as this problem is both practically motivated and a challenging extension of existing settings such as linear bandits. This article considers (for the first time) the stochastic bandit problem of learning to solve generic Kantorovich and entropic OT problems from repeated interactions when the marginals are known but the cost is unknown. We provide $\tilde{\mathcal O}(\sqrt{T})$ regret algorithms for both problems by extending linear bandits on Hilbert spaces. These results provide a reduction to infinite-dimensional linear bandits. To deal with the dimension, we provide a method to exploit the intrinsic regularity of the cost to learn, yielding corresponding regret bounds which interpolate between $\tilde{\mathcal O}(\sqrt{T})$ and $\tilde{\mathcal O}(T)$.
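To make the entropic OT problem concrete, here is a minimal sketch of the benchmark a learner is compared against when the cost is known: Sinkhorn iterations on a small discrete problem. The discrete setting, the function names `sinkhorn` and `transport_cost`, and the parameters `eps` and `n_iter` are illustrative assumptions, not taken from the paper, and this is not the paper's bandit algorithm.

```python
import math


def sinkhorn(cost, mu, nu, eps=0.1, n_iter=500):
    """Approximate the entropic OT plan between discrete marginals mu and nu.

    Illustrative helper (not from the paper): cost is a list of rows,
    mu and nu are probability vectors, eps is the entropic regularization.
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel K[i][j] = exp(-cost[i][j] / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the known marginals.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The plan has the form diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(cost, plan):
    """Linear transport cost: sum over i, j of cost[i][j] * plan[i][j]."""
    return sum(c * p for c_row, p_row in zip(cost, plan)
               for c, p in zip(c_row, p_row))


if __name__ == "__main__":
    cost = [[0.0, 1.0], [1.0, 0.0]]
    mu = [0.5, 0.5]
    nu = [0.5, 0.5]
    plan = sinkhorn(cost, mu, nu)
    # A learner that ignores the cost and plays the independent coupling
    # pays more than the (near-)optimal plan; the gap is its per-round regret.
    independent = [[a * b for b in nu] for a in mu]
    print(transport_cost(cost, independent) - transport_cost(cost, plan))
```

In the bandit setting described in the abstract, the learner cannot run this directly because `cost` is unknown; it must estimate the cost from noisy feedback while choosing plans that respect the known marginals.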

Problem

Research questions and friction points this paper is trying to address.

Sequential learning of Optimal Transport

Unknown cost with known marginals

Regret algorithms for Kantorovich and entropic OT
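
For reference, the two problems named above can be written in their standard forms. The notation ($\mu$, $\nu$ for the known marginals, $c$ for the unknown cost, $\pi$ for a transport plan, $\varepsilon$ for the entropic regularization, $\Pi(\mu,\nu)$ for the set of couplings) is assumed here and may differ from the paper's. The regret expression is one natural reading of the setting for the Kantorovich case, not a definition quoted from the paper.

```latex
% Kantorovich problem: minimize expected cost over couplings of mu and nu
\mathrm{OT}(c) = \min_{\pi \in \Pi(\mu,\nu)} \int c \,\mathrm{d}\pi

% Entropic OT: the same objective plus a KL penalty relative to the product measure
\mathrm{OT}_\varepsilon(c) = \min_{\pi \in \Pi(\mu,\nu)} \int c \,\mathrm{d}\pi
    + \varepsilon \,\mathrm{KL}(\pi \,\|\, \mu \otimes \nu)

% Assumed form of the regret after T rounds, with \pi_t the plan played at round t
R_T = \sum_{t=1}^{T} \left( \int c \,\mathrm{d}\pi_t - \mathrm{OT}(c) \right)
```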

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bandit Optimal Transport

Linear bandits on Hilbert spaces

Exploiting the intrinsic regularity of the cost
