Polyhedral Instability Governs Regret in Online Learning

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses online convex optimization with piecewise-linear objectives, focusing on how the polyhedral structure induced by convex relaxation influences regret. It introduces the notion of “polyhedral instability”, defined as the number of switches between active regions, and shows that this quantity governs the regret. Under full-information feedback and a fixed partition, with RS_T the number of region switches and V_max the maximum number of vertices per region, the work proves a tight bound of Θ(√((1+RS_T) T log V_max)). This bound interpolates between experts-like and dimension-dependent regret rates. For online submodular–concave games under Lovász convexification, instability reduces to a permutation-switch count. Experiments on shortest-path and influence maximization tasks validate the predicted scaling and indicate that low-instability regimes can arise in practice without explicit enumeration of the action space.
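To make the interpolation concrete, the following Python sketch evaluates the quantity inside the Θ(·) bound with all constants dropped. It is an illustration only: the function names `count_region_switches` and `regret_rate` are hypothetical and not from the paper, and the per-round active-region labels are assumed to be given as a plain sequence.

```python
import math
from typing import Hashable, Sequence


def count_region_switches(active_regions: Sequence[Hashable]) -> int:
    """Count rounds whose active region differs from the previous round.

    Assumption: each entry is a label identifying the active region in
    that round; how those labels are obtained is outside this sketch.
    """
    return sum(
        1 for prev, cur in zip(active_regions, active_regions[1:]) if prev != cur
    )


def regret_rate(region_switches: int, horizon: int, max_vertices: int) -> float:
    """Return sqrt((1 + RS_T) * T * log V_max), ignoring constant factors."""
    if region_switches < 0 or horizon <= 0 or max_vertices < 2:
        raise ValueError("need RS_T >= 0, T >= 1 and V_max >= 2")
    return math.sqrt((1 + region_switches) * horizon * math.log(max_vertices))
```

With `region_switches = 0` the expression reduces to √(T log V_max), the experts-like end of the range. Because the rate depends on 1 + RS_T through a square root, quadrupling that factor doubles the value.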

📝 Abstract

Many online decision problems over combinatorial actions are addressed via convex relaxations, leading to online convex optimization with piecewise-linear objectives and induced polyhedral structure. We show that regret in such problems is governed by \emph{polyhedral instability}: the number of changes of the active region. Under full-information feedback and fixed-partition assumptions, if $\mathrm{RS}_T$ denotes the number of region switches and $V_{\max}$ the maximum number of vertices per region, we prove $\mathrm{Regret}_T = \Theta(\sqrt{(1+\mathrm{RS}_T)\,T\,\log V_{\max}})$, interpolating between experts-like and dimension-dependent OCO rates. For online submodular--concave games under Lovász convexification, this reduces to the permutation-switch count $\mathrm{SC}_T$, yielding the matching rate $\mathrm{Regret}_T = \Theta(\sqrt{(1+\mathrm{SC}_T)\,T\,\log n})$. Experiments on synthetic and real combinatorial problems (shortest path, influence maximization) validate the predicted scaling and indicate that low-instability regimes can arise in practice without explicit enumeration of actions.
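For the submodular case, a short Python sketch may help show where a permutation switch comes from. It uses the standard sorting-based formula for the Lovász extension of a set function F with F(∅) = 0, which is linear on each region where the ordering of the coordinates is fixed. The abstract does not spell out how SC_T is defined, so the sketch assumes it counts rounds in which the sorting permutation of the iterate changes. All function names here are hypothetical.

```python
from typing import Callable, FrozenSet, Sequence, Tuple


def sorting_permutation(x: Sequence[float]) -> Tuple[int, ...]:
    """Indices ordered by decreasing coordinate; ties broken by index."""
    return tuple(sorted(range(len(x)), key=lambda i: (-x[i], i)))


def lovasz_extension(
    F: Callable[[FrozenSet[int]], float], x: Sequence[float]
) -> float:
    """Evaluate the Lovász extension of F at x, assuming F(empty set) = 0.

    Uses sum_i x[pi(i)] * (F(S_i) - F(S_{i-1})), where S_i holds the
    i largest coordinates of x.
    """
    value = 0.0
    chain = set()
    prev = F(frozenset())
    for i in sorting_permutation(x):
        chain.add(i)
        cur = F(frozenset(chain))
        value += x[i] * (cur - prev)  # marginal gain weighted by coordinate
        prev = cur
    return value


def permutation_switch_count(points: Sequence[Sequence[float]]) -> int:
    """Assumed reading of SC_T: rounds where the sorting permutation changes."""
    perms = [sorting_permutation(x) for x in points]
    return sum(1 for a, b in zip(perms, perms[1:]) if a != b)
```

As a sanity check, for F(S) = min(|S|, 1) the extension equals the largest coordinate of x, and on a 0/1 indicator vector it agrees with F on the corresponding set.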

Problem

Research questions and friction points this paper is trying to address.

polyhedral instability

online learning

regret

combinatorial actions

region switches

Innovation

Methods, ideas, or system contributions that make the work stand out.

polyhedral instability

region switches

online convex optimization