Multi-Objective $\textit{min-max}$ Online Convex Optimization

📅 2025-10-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies multi-objective online convex optimization (MOCO): given $K$ independent loss function sequences, the algorithm must select a single action per round without observing the current losses of any sequence. Departing from standard single-objective settings, we propose a min-max regret criterion, measuring the worst-case performance gap between the online policy and the static optimal action that minimizes the maximum total loss across all $K$ sequences. We design an efficient algorithm by integrating the Hedge algorithm with online gradient descent (OGD), under an i.i.d. input assumption. We prove that its expected min-max regret is bounded by $O(\sqrt{T \log K})$, achieving for the first time a logarithmic dependence on the number of objectives $K$. This bound matches the fundamental lower bound for this setting, establishing the optimal convergence rate.

📝 Abstract
In online convex optimization (OCO), a single loss function sequence is revealed over a time horizon of $T$, and an online algorithm has to choose its action at time $t$, before the loss function at time $t$ is revealed. The goal of the online algorithm is to incur minimal penalty (called $\textit{regret}$) compared to a static optimal action made by an optimal offline algorithm knowing all functions of the sequence in advance. In this paper, we broaden the horizon of OCO, and consider multi-objective OCO, where there are $K$ distinct loss function sequences, and an algorithm has to choose its action at time $t$, before the $K$ loss functions at time $t$ are revealed. To capture the tradeoff between tracking the $K$ different sequences, we consider the $\textit{min-max}$ regret, where the benchmark (optimal offline algorithm) takes a static action across all time slots that minimizes the maximum of the total loss (summed across time slots) incurred by each of the $K$ sequences. An online algorithm is allowed to change its action across time slots, and its $\textit{min-max}$ regret is defined as the difference between its $\textit{min-max}$ cost and that of the benchmark. The $\textit{min-max}$ regret is a stringent performance measure, and an algorithm with small regret needs to `track' all loss function sequences closely at all times. We consider this $\textit{min-max}$ regret in the i.i.d. input setting, where all loss functions are i.i.d. generated from an unknown distribution. For the i.i.d. model we propose a simple algorithm that combines the well-known $\textit{Hedge}$ and online gradient descent (OGD) and show via a remarkably simple proof that its expected $\textit{min-max}$ regret is $O(\sqrt{T \log K})$.
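Concretely, writing $f_k^t$ for the loss of sequence $k$ at time $t$, $x_t$ for the online action, and $\mathcal{X}$ for the feasible action set (notation chosen here for illustration), the min-max regret described in the abstract can be written as:

```latex
\[
  \mathrm{Regret}_T
  \;=\;
  \max_{k \in [K]} \sum_{t=1}^{T} f_k^t(x_t)
  \;-\;
  \min_{x \in \mathcal{X}} \, \max_{k \in [K]} \sum_{t=1}^{T} f_k^t(x).
\]
```

The second term is the benchmark: a single static action that minimizes the worst total loss over the $K$ sequences.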
Problem

Research questions and friction points this paper is trying to address.

Extending online convex optimization to multiple objective functions
Minimizing worst-case regret across multiple loss sequences
Balancing trade-offs between tracking different loss functions simultaneously
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines Hedge and online gradient descent
Addresses multi-objective min-max regret
Handles i.i.d. loss functions from distributions
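The Hedge-plus-OGD combination listed above can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the step-size schedules `eta_x` and `eta_p`, the Euclidean-ball feasible set, and the toy quadratic losses are all assumptions made here for concreteness.

```python
import numpy as np

def hedge_ogd(loss, grad, K, T, dim, radius):
    """Sketch: Hedge over the K objectives combined with OGD over actions.

    loss(k, x) -> scalar loss of objective k at action x; grad(k, x) -> its
    gradient. The same functions are reused each round, standing in for
    i.i.d. draws from an unknown distribution.
    """
    eta_x = radius / np.sqrt(T)        # OGD step size (assumed schedule)
    eta_p = np.sqrt(np.log(K) / T)     # Hedge learning rate (assumed schedule)
    x = np.zeros(dim)
    cum = np.zeros(K)                  # cumulative loss of each objective
    avg = np.zeros(dim)                # running sum of iterates
    for _ in range(T):
        # Hedge: weight objectives by exponentiated cumulative loss, so the
        # currently-worst sequence dominates the gradient direction
        w = np.exp(eta_p * (cum - cum.max()))
        p = w / w.sum()
        # OGD step on the Hedge-weighted combination of gradients
        g = sum(p[k] * grad(k, x) for k in range(K))
        x = x - eta_x * g
        # project back onto the Euclidean ball of the given radius
        n = np.linalg.norm(x)
        if n > radius:
            x = x * (radius / n)
        avg += x
        # feed the realized losses back into Hedge
        for k in range(K):
            cum[k] += loss(k, x)
    return avg / T                     # averaged iterate

# Toy check: two quadratics (x+1)^2 and (x-2)^2 on the real line; the
# static min-max optimum equalizes them at x = 0.5.
targets = np.array([-1.0, 2.0])
loss = lambda k, x: float((x[0] - targets[k]) ** 2)
grad = lambda k, x: 2.0 * (x - targets[k])
x_avg = hedge_ogd(loss, grad, K=2, T=2000, dim=1, radius=3.0)
```

The Hedge player adaptively shifts weight toward whichever sequence has suffered the most so far, while OGD tracks the minimizer of the resulting weighted loss; the averaged iterate approaches the static min-max action.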