🤖 AI Summary
In online linear programming, existing first-order online algorithms achieve only $O(\sqrt{T})$ regret, significantly worse than the $O(\log T)$ bound attained by state-of-the-art LP-based online algorithms, particularly in continuous support settings where non-degeneracy assumptions are restrictive. This work proposes a "learning-decision decoupling" framework that integrates regret analysis with structural modeling of the support set under dual error-bound conditions. Theoretically, it achieves the optimal $O(\log T)$ regret for problems with finite support and, for the first time, attains $o(\sqrt{T})$ regret for continuous support, thereby removing the non-degeneracy requirement. The method combines dual error-bound analysis, first-order online optimization, and stochastic approximation theory. These advances yield tighter theoretical guarantees and a scalable algorithmic paradigm for online resource allocation and revenue management.
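For context, the "dual error-bound conditions" refer to the dual of the standard online LP. In the conventional formulation from the literature (this notation is standard, not taken verbatim from the paper), the offline problem and its dual are

$$
\max_{x \in [0,1]^T} \sum_{t=1}^{T} r_t x_t \quad \text{s.t.} \quad \sum_{t=1}^{T} a_t x_t \le b,
\qquad
\min_{p \ge 0} \; f(p) := b^\top p + \sum_{t=1}^{T} \left( r_t - a_t^\top p \right)_+ .
$$

A typical dual error bound (for instance, quadratic growth) asserts that near the optimal dual price $p^\star$, $f(p) - f(p^\star) \ge \mu \, \lVert p - p^\star \rVert^2$ for some $\mu > 0$. Conditions of this kind are what let first-order dual updates converge fast enough to break the $\sqrt{T}$ regret barrier.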
📝 Abstract
Online linear programming plays an important role in both revenue management and resource allocation, and recent research has focused on developing efficient first-order online learning algorithms. Despite the empirical success of first-order methods, they typically achieve a regret no better than $\mathcal{O}(\sqrt{T})$, which is suboptimal compared to the $\mathcal{O}(\log T)$ bound guaranteed by the state-of-the-art linear programming (LP)-based online algorithms. This paper establishes a general framework that improves upon the $\mathcal{O}(\sqrt{T})$ result when the LP dual problem exhibits certain error bound conditions. For the first time, we show that first-order learning algorithms achieve $o(\sqrt{T})$ regret in the continuous support setting and $\mathcal{O}(\log T)$ regret in the finite support setting beyond the non-degeneracy assumption. Our results significantly improve the state-of-the-art regret results and provide new insights for sequential decision-making.
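The paper itself does not ship code; as a rough sketch of the kind of first-order baseline the abstract discusses, the snippet below implements textbook projected-subgradient dual descent for online LP, the method known to achieve $\mathcal{O}(\sqrt{T})$ regret with a $\Theta(1/\sqrt{T})$ step size. The function name and the synthetic usage data are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def dual_descent_online_lp(rewards, consumption, budget_rates, step=None):
    """Projected-subgradient dual descent for online LP (illustrative sketch).

    rewards:      (T,)   reward r_t of request t
    consumption:  (T, m) resource consumption a_t of request t
    budget_rates: (m,)   average budget per period, d = b / T
    """
    T, m = consumption.shape
    if step is None:
        step = 1.0 / np.sqrt(T)   # Theta(1/sqrt(T)) step behind the O(sqrt(T)) regret
    p = np.zeros(m)               # dual prices, one per resource
    remaining = budget_rates * T  # total budget b
    decisions = np.zeros(T)

    for t in range(T):
        r_t, a_t = rewards[t], consumption[t]
        # Accept iff the reward exceeds the dual-priced resource cost
        # and enough budget remains.
        x_t = 1.0 if (r_t > a_t @ p and np.all(remaining >= a_t)) else 0.0
        decisions[t] = x_t
        remaining -= a_t * x_t
        # Projected subgradient step: p <- [p + step * (a_t * x_t - d)]_+
        p = np.maximum(0.0, p + step * (a_t * x_t - budget_rates))
    return decisions, p

# Illustrative synthetic instance (not from the paper).
rng = np.random.default_rng(0)
T, m = 10_000, 3
rewards = rng.uniform(size=T)
consumption = rng.uniform(size=(T, m))
x, p = dual_descent_online_lp(rewards, consumption, budget_rates=0.25 * np.ones(m))
print(f"accepted {x.sum():.0f} of {T}, final dual prices {p}")
```

The conservative $\Theta(1/\sqrt{T})$ step size is exactly what caps this baseline at $\mathcal{O}(\sqrt{T})$ regret; the paper's contribution is showing that, when the dual problem satisfies the error-bound conditions above, first-order updates can be driven faster, yielding $o(\sqrt{T})$ regret for continuous support and $\mathcal{O}(\log T)$ for finite support.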