🤖 AI Summary
This work systematically analyzes the theoretical performance of inexact online mirror descent (OMD), focusing on how approximation errors in solving the proximal subproblem affect regret bounds. Leveraging tools from convex analysis, online learning theory, and uniform smoothness/barrier properties, we derive tight, error-dependent excess regret bounds. Key results: (i) For uniformly smooth regularizers, the excess regret incurred by approximation errors admits a tight bound. (ii) Negative entropy over the simplex requires exponentially small approximation errors to avoid linear regret under adversarial losses; it regains robustness under stochastic losses on the simplex, but not on all subsets, where exponentially small errors remain necessary. (iii) In contrast, log-barrier and Tsallis-entropy regularizers sustain sublinear regret under polynomially small approximation errors, demonstrating superior error tolerance. These findings uncover a fundamental trade-off between regularizer smoothness and approximation resilience, establishing precise conditions under which inexact OMD remains viable. The analysis provides both theoretical foundations and practical guidance for regularizer selection in real-world OMD deployments.
📝 Abstract
Online mirror descent (OMD) is a fundamental algorithmic paradigm that underlies many algorithms in optimization, machine learning, and sequential decision-making. The OMD iterates are defined as solutions to optimization subproblems which, oftentimes, can be solved only approximately, leading to an inexact version of the algorithm. Nonetheless, existing OMD analyses typically assume an idealized, error-free setting, thereby limiting our understanding of the performance guarantees that should be expected in practice. In this work, we initiate a systematic study of inexact OMD and uncover an intricate relation between regularizer smoothness and robustness to approximation errors. When the regularizer is uniformly smooth, we establish a tight bound on the excess regret due to errors. Then, for barrier regularizers over the simplex and its subsets, we identify a sharp separation: negative entropy requires exponentially small errors to avoid linear regret, whereas log-barrier and Tsallis regularizers remain robust even when the errors are only polynomially small. Finally, we show that when the losses are stochastic and the domain is the simplex, negative entropy regains robustness; this property, however, does not extend to all subsets, where exponentially small errors are again necessary to avoid suboptimal regret.
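To make the setting concrete, here is a minimal sketch of inexact OMD on the simplex with the negative-entropy regularizer, for which the exact proximal step reduces to an exponentiated-gradient update. The perturbation model below (an L1-bounded noise term of magnitude `eps` added to each iterate) is an illustrative stand-in for the paper's formal error notion, and all names and parameters are our own choices, not the paper's.

```python
import numpy as np

def inexact_omd_simplex(grads, eta=0.1, eps=0.0, rng=None):
    """Sketch of inexact OMD with the negative-entropy regularizer.

    Exact OMD with negative entropy over the simplex is exponentiated
    gradient: x_{t+1}[i] proportional to x_t[i] * exp(-eta * g_t[i]).
    Here `eps` models the error from solving the proximal subproblem
    only approximately: the exact solution is perturbed and then
    re-normalized back onto the simplex (illustrative error model).
    """
    rng = rng or np.random.default_rng(0)
    d = len(grads[0])
    x = np.full(d, 1.0 / d)           # start at the uniform distribution
    iterates = [x.copy()]
    for g in grads:
        x = x * np.exp(-eta * np.asarray(g, dtype=float))
        x /= x.sum()                   # exact proximal solution
        if eps > 0:                    # inject the approximation error
            noise = rng.uniform(-1.0, 1.0, size=d)
            x = np.clip(x + eps * noise / d, 1e-12, None)
            x /= x.sum()
        iterates.append(x.copy())
    return iterates
```

With `eps=0` this is standard exponentiated gradient and concentrates on the coordinate with smallest cumulative loss; the paper's results concern how large `eps` may be before such guarantees break down.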