Beyond $\tilde{O}(\sqrt{T})$ Constraint Violation for Online Convex Optimization with Adversarial Constraints

📅 2025-05-10
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This paper studies online convex optimization (OCO) under adversarial constraints, aiming to jointly minimize regret and cumulative constraint violation (CCV). To go beyond the $\tilde{O}(\sqrt{T})$ CCV achieved by the best previously known policies, the authors propose the first tunable trade-off framework, introducing a free parameter $\beta \in [0,1]$. Under general convexity, it achieves $\tilde{O}(dT^{1-\beta})$ CCV and $\tilde{O}(\sqrt{dT} + T^{\beta})$ regret; under an additional smoothness assumption, this improves to $O(T^{\max\{1/2,\beta\}})$ regret and $\tilde{O}(T^{1-\beta})$ CCV. Key technical ingredients include an adaptive small-loss regret bound, a constrained-expert formulation, a covering-based reduction over the convex decision set, and a tailored gradient-based policy. The approach is the first to break the conventional $\tilde{O}(\sqrt{T})$ CCV barrier, establishing a continuous regret–CCV trade-off.

πŸ“ Abstract
We revisit the Online Convex Optimization problem with adversarial constraints (COCO) where, in each round, a learner is presented with a convex cost function and a convex constraint function, both of which may be chosen adversarially. The learner selects actions from a convex decision set in an online fashion, with the goal of minimizing both regret and the cumulative constraint violation (CCV) over a horizon of $T$ rounds. The best-known policy for this problem achieves $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV. In this paper, we present a surprising improvement that achieves a significantly smaller CCV by trading it off with regret. Specifically, for any bounded convex cost and constraint functions, we propose an online policy that achieves $\tilde{O}(\sqrt{dT}+ T^{\beta})$ regret and $\tilde{O}(dT^{1-\beta})$ CCV, where $d$ is the dimension of the decision set and $\beta \in [0,1]$ is a tunable parameter. We achieve this result by first considering the special case of the $\textsf{Constrained Expert}$ problem where the decision set is a probability simplex and the cost and constraint functions are linear. Leveraging a new adaptive small-loss regret bound, we propose an efficient policy for the $\textsf{Constrained Expert}$ problem that attains $O(\sqrt{T\ln N}+T^{\beta})$ regret and $\tilde{O}(T^{1-\beta} \ln N)$ CCV, where $N$ is the number of experts. The original problem is then reduced to the $\textsf{Constrained Expert}$ problem via a covering argument. Finally, with an additional smoothness assumption, we propose an efficient gradient-based policy attaining $O(T^{\max(\frac{1}{2},\beta)})$ regret and $\tilde{O}(T^{1-\beta})$ CCV.
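For reference, the two quantities being traded off are the standard ones in this literature: with cost functions $f_t$, constraint functions $g_t$, and decision set $\mathcal{X}$, regret and CCV are typically defined as below (static-regret form shown; the paper's exact conventions may differ in minor details such as how violations are aggregated):

```latex
\[
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x),
\qquad
\mathrm{CCV}_T \;=\; \sum_{t=1}^{T} \max\bigl(g_t(x_t),\, 0\bigr).
\]
```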
Problem

Research questions and friction points this paper is trying to address.

Minimize regret and constraint violation in online convex optimization
Improve cumulative constraint violation beyond $\tilde{O}(\sqrt{T})$ bounds
Trade off constraint violation with regret using tunable parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive small-loss regret bound optimization
Tunable parameter balancing regret and CCV
Gradient-based policy with smoothness assumption
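To make the gradient-based idea concrete, here is a minimal sketch of a generic primal-dual online gradient descent for OCO with constraints. This is an illustrative baseline, not the paper's algorithm (which uses an adaptive small-loss analysis and a tuned trade-off parameter); the step sizes `eta_x`, `eta_lam` and the Euclidean-ball decision set are hypothetical choices for the example.

```python
import numpy as np

def primal_dual_ogd(T, d, grad_f, grad_g, g, radius=1.0):
    """Illustrative primal-dual OGD over the Euclidean ball of given radius.

    grad_f(t, x): gradient of the round-t cost at x
    grad_g(t, x): gradient of the round-t constraint at x
    g(t, x):      round-t constraint value g_t(x)
    Returns the list of actions x_1, ..., x_T played.
    """
    x = np.zeros(d)
    lam = 0.0                      # dual variable for the constraint
    eta_x = radius / np.sqrt(T)    # primal step size (hypothetical tuning)
    eta_lam = 1.0 / np.sqrt(T)     # dual step size (hypothetical tuning)
    history = []
    for t in range(T):
        history.append(x.copy())
        # Gradient step on the instantaneous Lagrangian f_t + lam * g_t
        gx = grad_f(t, x) + lam * grad_g(t, x)
        x = x - eta_x * gx
        nrm = np.linalg.norm(x)
        if nrm > radius:           # project back onto the ball
            x = x * (radius / nrm)
        # Dual ascent on the observed violation, kept nonnegative
        lam = max(0.0, lam + eta_lam * g(t, x))
    return history
```

In this vanilla form the dual variable only grows while constraints are violated, which is what drives CCV down at the expense of regret; the paper's contribution is, roughly, making that exchange rate tunable via $\beta$.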
Abhishek Sinha
School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai 400005, India
Rahul Vaze
Associate Professor, Electrical Engineering, Tata Institute of Fundamental Research, Mumbai, India.
Wireless Communication · Information Theory · Statistical Learning