🤖 AI Summary
In online linear programming, existing first-order online algorithms achieve only $O(\sqrt{T})$ regret, significantly worse than the $O(\log T)$ bound attained by state-of-the-art LP-based online algorithms, particularly in continuous support settings where non-degeneracy assumptions are restrictive. This work proposes a "learning-decision decoupling" framework that integrates regret analysis with structural modeling of the support set under dual error-bound conditions. Theoretically, it achieves the optimal $O(\log T)$ regret for problems with finite support and, for the first time, attains $o(\sqrt{T})$ regret for continuous support, thereby removing the non-degeneracy requirement. The method combines dual error-bound analysis, first-order online optimization, and stochastic approximation theory. These advances yield tighter theoretical guarantees and a scalable algorithmic paradigm for online resource allocation and revenue management.
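For context, the "dual error-bound conditions" refer to the dual of the standard online LP. In the conventional formulation from the literature (this notation is standard, not taken verbatim from the paper), the offline problem and its dual are

$$
\max_{x \in [0,1]^T} \sum_{t=1}^{T} r_t x_t \quad \text{s.t.} \quad \sum_{t=1}^{T} a_t x_t \le b,
\qquad
\min_{p \ge 0} \; f(p) := b^\top p + \sum_{t=1}^{T} \left( r_t - a_t^\top p \right)_+ .
$$

A typical dual error bound (for instance, quadratic growth) asserts that near the optimal dual price $p^\star$, $f(p) - f(p^\star) \ge \mu \, \lVert p - p^\star \rVert^2$ for some $\mu > 0$. Conditions of this kind are what let first-order dual updates converge fast enough to break the $\sqrt{T}$ regret barrier.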
📝 Abstract
Online linear programming plays an important role in both revenue management and resource allocation, and recent research has focused on developing efficient first-order online learning algorithms. Despite the empirical success of first-order methods, they typically achieve a regret no better than $\mathcal{O}(\sqrt{T})$, which is suboptimal compared to the $\mathcal{O}(\log T)$ bound guaranteed by the state-of-the-art linear programming (LP)-based online algorithms. This paper establishes a general framework that improves upon the $\mathcal{O}(\sqrt{T})$ result when the LP dual problem exhibits certain error bound conditions. For the first time, we show that first-order learning algorithms achieve $o(\sqrt{T})$ regret in the continuous support setting and $\mathcal{O}(\log T)$ regret in the finite support setting beyond the non-degeneracy assumption. Our results significantly improve the state-of-the-art regret results and provide new insights for sequential decision-making.
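The paper itself does not ship code; as a rough sketch of the kind of first-order baseline the abstract discusses, the snippet below implements textbook projected-subgradient dual descent for online LP, the method known to achieve $\mathcal{O}(\sqrt{T})$ regret with a $\Theta(1/\sqrt{T})$ step size. The function name and the synthetic usage data are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def dual_descent_online_lp(rewards, consumption, budget_rates, step=None):
    """Projected-subgradient dual descent for online LP (illustrative sketch).

    rewards:      (T,)   reward r_t of request t
    consumption:  (T, m) resource consumption a_t of request t
    budget_rates: (m,)   average budget per period, d = b / T
    """
    T, m = consumption.shape
    if step is None:
        step = 1.0 / np.sqrt(T)   # Theta(1/sqrt(T)) step behind the O(sqrt(T)) regret
    p = np.zeros(m)               # dual prices, one per resource
    remaining = budget_rates * T  # total budget b
    decisions = np.zeros(T)

    for t in range(T):
        r_t, a_t = rewards[t], consumption[t]
        # Accept iff the reward exceeds the dual-priced resource cost
        # and enough budget remains.
        x_t = 1.0 if (r_t > a_t @ p and np.all(remaining >= a_t)) else 0.0
        decisions[t] = x_t
        remaining -= a_t * x_t
        # Projected subgradient step: p <- [p + step * (a_t * x_t - d)]_+
        p = np.maximum(0.0, p + step * (a_t * x_t - budget_rates))
    return decisions, p

# Illustrative synthetic instance (not from the paper).
rng = np.random.default_rng(0)
T, m = 10_000, 3
rewards = rng.uniform(size=T)
consumption = rng.uniform(size=(T, m))
x, p = dual_descent_online_lp(rewards, consumption, budget_rates=0.25 * np.ones(m))
print(f"accepted {x.sum():.0f} of {T}, final dual prices {p}")
```

The conservative $\Theta(1/\sqrt{T})$ step size is exactly what caps this baseline at $\mathcal{O}(\sqrt{T})$ regret; the paper's contribution is showing that, when the dual problem satisfies the error-bound conditions above, first-order updates can be driven faster, yielding $o(\sqrt{T})$ regret for continuous support and $\mathcal{O}(\log T)$ for finite support.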