AI Summary
This paper studies online convex optimization (OCO) under adversarial constraints, aiming to jointly minimize dynamic regret and cumulative constraint violation (CCV). To overcome the $\tilde{O}(\sqrt{T})$ lower bound on CCV from prior work, we propose the first tunable trade-off framework, introducing a free parameter $\beta \in [0,1/2]$. Under general convexity, it achieves $\tilde{O}(dT^{1-\beta})$ CCV and $\tilde{O}(\sqrt{dT} + T^{\beta})$ dynamic regret; under smoothness, this improves to $O(T^{\max\{1/2,\beta\}})$ regret and $\tilde{O}(T^{1-\beta})$ CCV. Key technical innovations include an adaptive small-loss analysis, a constrained-expert formulation, a reduction via convex-set covering, and a tailored gradient-descent design. Our approach is the first to break the conventional $\tilde{O}(\sqrt{T})$ CCV barrier, establishing a novel continuous regret–CCV trade-off paradigm.
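To make the trade-off concrete, the following sketch (an illustration of the stated general-convex bounds, not the paper's algorithm) tabulates the growth exponents of $T$ in the regret bound $\tilde{O}(\sqrt{dT} + T^{\beta})$ and the CCV bound $\tilde{O}(dT^{1-\beta})$ as $\beta$ varies over $[0, 1/2]$, ignoring dimension and logarithmic factors:

```python
# Illustrative only: exponents of T in the general-convex bounds
# regret ~ sqrt(dT) + T^beta  and  CCV ~ d * T^(1 - beta),
# with the tunable parameter beta in [0, 1/2] (d and log factors ignored).
def tradeoff_exponents(beta: float) -> tuple[float, float]:
    """Return (regret exponent, CCV exponent) of T for a given beta."""
    assert 0.0 <= beta <= 0.5
    return max(0.5, beta), 1.0 - beta

for beta in (0.0, 0.25, 0.5):
    r, v = tradeoff_exponents(beta)
    print(f"beta={beta:.2f}: regret ~ T^{r:.2f}, CCV ~ T^{v:.2f}")
```

Note that in this regime the regret exponent stays at $1/2$ while the CCV exponent drops from $1$ toward $1/2$ as $\beta$ grows, which is the sense in which smaller CCV is bought without worsening the regret rate.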
Abstract
We revisit the Online Convex Optimization problem with adversarial constraints (COCO) where, in each round, a learner is presented with a convex cost function and a convex constraint function, both of which may be chosen adversarially. The learner selects actions from a convex decision set in an online fashion, with the goal of minimizing both the regret and the cumulative constraint violation (CCV) over a horizon of $T$ rounds. The best-known policy for this problem achieves $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV. In this paper, we present a surprising improvement that achieves a significantly smaller CCV by trading it off against regret. Specifically, for any bounded convex cost and constraint functions, we propose an online policy that achieves $\tilde{O}(\sqrt{dT} + T^{\beta})$ regret and $\tilde{O}(dT^{1-\beta})$ CCV, where $d$ is the dimension of the decision set and $\beta \in [0,1]$ is a tunable parameter. We obtain this result by first considering the special case of the $\textsf{Constrained Expert}$ problem, where the decision set is a probability simplex and the cost and constraint functions are linear. Leveraging a new adaptive small-loss regret bound, we propose an efficient policy for the $\textsf{Constrained Expert}$ problem that attains $O(\sqrt{T \ln N} + T^{\beta})$ regret and $\tilde{O}(T^{1-\beta} \ln N)$ CCV, where $N$ is the number of experts. The original problem is then reduced to the $\textsf{Constrained Expert}$ problem via a covering argument. Finally, under an additional smoothness assumption, we propose an efficient gradient-based policy attaining $O(T^{\max(\frac{1}{2},\beta)})$ regret and $\tilde{O}(T^{1-\beta})$ CCV.
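As a rough intuition for the expert-based building block, the sketch below runs a generic multiplicative-weights (Hedge) update over $N$ experts on a penalized linear loss, cost plus $\lambda$ times the constraint value. This is a hypothetical simplification: the fixed penalty `lam` and learning rate `eta_lr` are placeholders, whereas the paper's policy relies on an adaptive small-loss scheme; the sketch only illustrates the shape of a $\textsf{Constrained Expert}$ interaction.

```python
import numpy as np

def hedge_with_penalty(costs, violations, lam=2.0, eta_lr=0.1):
    """Toy Hedge update on the penalized loss cost + lam * violation.

    costs, violations: (T, N) arrays of per-round linear losses, one entry
    per expert; returns the cumulative expected cost and cumulative
    positive-part constraint violation of the played distributions.
    NOT the paper's algorithm -- lam and eta_lr are fixed placeholders.
    """
    T, N = costs.shape
    log_w = np.zeros(N)                 # log-weights, for numerical stability
    total_cost, total_ccv = 0.0, 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                    # current distribution over experts
        total_cost += p @ costs[t]
        total_ccv += max(0.0, p @ violations[t])
        log_w -= eta_lr * (costs[t] + lam * violations[t])
    return total_cost, total_ccv
```

Raising `lam` pushes weight away from violating experts faster, shrinking CCV at the price of higher cost; the paper's tunable $\beta$ plays an analogous role at the level of rates, but through an adaptive mechanism rather than a fixed penalty.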