🤖 AI Summary
This paper studies no-regret learning in multi-player general-sum games, aiming to minimize each player's cumulative regret. We propose an uncoupled online algorithm that integrates optimistic multiplicative weights update (OMWU) with an adaptive, non-monotonic learning rate scheme, incorporating a cautious-optimism strategy to dynamically adjust the learning pace. Theoretically, the algorithm achieves a per-player regret upper bound of $O(n \log^2 d \log T)$, where $n$ is the number of players, $d$ the number of actions per player, and $T$ the number of rounds. Compared to prior methods, this exponentially improves the dependence on the action dimension, from linear in $d$ to $\log^2 d$, and reduces the time dependence from $\log^4 T$ to $\log T$. To our knowledge, this is the tightest per-player regret bound established for multi-player general-sum games, significantly outperforming baselines such as Log-Regularized Lifted Optimistic FTRL and Optimistic Hedge.
📝 Abstract
We establish the first uncoupled learning algorithm that attains $O(n \log^2 d \log T)$ per-player regret in multi-player general-sum games, where $n$ is the number of players, $d$ is the number of actions available to each player, and $T$ is the number of repetitions of the game. Our results exponentially improve the dependence on $d$ compared to the $O(nd \log T)$ regret attainable by Log-Regularized Lifted Optimistic FTRL [Far+22c], and also reduce the dependence on the number of iterations $T$ from $\log^4 T$ to $\log T$ compared to Optimistic Hedge, the previously well-studied algorithm with $O(n \log d \log^4 T)$ regret [DFG21]. Our algorithm is obtained by combining the classic Optimistic Multiplicative Weights Update (OMWU) with an adaptive, non-monotonic learning rate that paces the learning process of the players, making them more cautious when their regret becomes too negative.
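To make the two ingredients concrete, here is a minimal sketch of a single OMWU step together with a *hypothetical* pacing rule. The OMWU update itself is standard (update against the optimistic loss estimate $2\ell_t - \ell_{t-1}$); the `cautious_eta` function is only an illustration of the idea of shrinking the step size when regret becomes very negative, not the paper's actual learning-rate schedule.

```python
import numpy as np

def omwu_step(x, loss, prev_loss, eta):
    """One Optimistic Multiplicative Weights Update (OMWU) step.

    Uses the previous loss vector as an optimistic prediction of the
    next one, i.e. updates the strategy x against 2*loss - prev_loss.
    """
    logits = np.log(x) - eta * (2.0 * loss - prev_loss)
    w = np.exp(logits - logits.max())  # subtract max for numerical stability
    return w / w.sum()               # renormalize to the simplex

def cautious_eta(cum_regret, base_eta):
    """Illustrative (NOT the paper's) cautious pacing rule: shrink the
    learning rate when cumulative regret becomes very negative, making
    the player update more conservatively."""
    return base_eta / (1.0 + max(0.0, -cum_regret))

# Example: a player with d = 3 actions, starting from the uniform strategy.
x = np.ones(3) / 3
loss, prev_loss = np.array([0.2, 0.5, 0.1]), np.zeros(3)
x = omwu_step(x, loss, prev_loss, eta=cautious_eta(0.0, 0.1))
```

The non-monotonicity in the paper means the learning rate can both shrink and grow over time, unlike the usual decreasing schedules; the sketch above only captures the "more cautious when regret is too negative" direction.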