On the Exploitability of FTRL Dynamics

📅 2026-04-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates the exploitability of Follow-the-Regularized-Leader (FTRL) learners with a fixed step size η in two-player zero-sum games against an oracle optimizer. Combining game theory and online learning theory with an analysis of random games, and distinguishing between the fixed-optimizer and alternating-optimizer settings, the work establishes, for the first time, that exploitability is an inherent property of the FTRL family. The core contributions are a geometric dichotomy based on the steepness of the regularizer, a susceptibility measure quantifying how vulnerable a regularizer is to strategic manipulation, and two theoretical guarantees: a lower bound of Ω(N/η) on exploitability under a fixed optimizer, where N is the number of the learner's suboptimal actions, and a high-probability surplus of Ω(ηT/poly(n,m)) in cumulative payoff under an alternating optimizer.

📝 Abstract

In this paper we investigate the exploitability of a Follow-the-Regularized-Leader (FTRL) learner with constant step size $\eta$ in $n\times m$ two-player zero-sum games played over $T$ rounds against a clairvoyant optimizer. In contrast with prior analysis, we show that exploitability is an inherent feature of the FTRL family, rather than an artifact of specific instantiations. First, for a fixed optimizer, we establish a sweeping law of order $\Omega(N/\eta)$, proving that exploitation scales with the number $N$ of the learner's suboptimal actions and vanishes in their absence. Second, for an alternating optimizer, a surplus of $\Omega(\eta T/\mathrm{poly}(n,m))$ can be guaranteed regardless of the equilibrium structure, with high probability, in random games. Our analysis once more uncovers a sharp geometric dichotomy: non-steep regularizers allow the optimizer to extract maximum surplus via finite-time elimination of suboptimal actions, whereas steep ones introduce a vanishing correction that may delay exploitation. Finally, we discuss whether this leverage persists under bilateral payoff uncertainty, and we propose a susceptibility measure to quantify which regularizers are most vulnerable to strategic manipulation.
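The steep versus non-steep dichotomy described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's construction: it assumes the learner maximizes payoff, that the optimizer repeats a single fixed action so the learner sees the same payoff vector every round, and it picks the Euclidean regularizer (non-steep) and negative entropy (steep) as representative examples. The function names `project_simplex`, `ftrl_euclidean`, `ftrl_entropic`, and `simulate` are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u)
    ks = np.arange(1, len(v) + 1)
    # largest index whose entry stays positive after the shift
    rho = np.nonzero(u - (css - 1.0) / ks > 0)[0][-1]
    tau = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - tau, 0.0)

def ftrl_euclidean(score, eta):
    """FTRL with h(x) = 0.5 * ||x||^2 (non-steep): project eta * score."""
    return project_simplex(eta * score)

def ftrl_entropic(score, eta):
    """FTRL with negative entropy (steep): softmax of eta * score."""
    z = eta * score
    z = z - z.max()                           # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()

def simulate(payoff_vector, eta, T, rule):
    """Run T rounds against a fixed optimizer action (assumed setting).

    payoff_vector[i] is the learner's payoff for action i in every round.
    Returns a (T, n) array with the learner's mixed strategy per round.
    """
    score = np.zeros_like(payoff_vector, dtype=float)
    history = []
    for _ in range(T):
        history.append(rule(score, eta))      # play from past payoffs only
        score = score + payoff_vector         # accumulate this round's payoffs
    return np.array(history)
```

For example, calling `simulate(np.array([1.0, 0.5, 0.0]), eta=0.1, T=50, rule=ftrl_euclidean)` yields a final strategy whose suboptimal entries are exactly zero, whereas `ftrl_entropic` keeps every entry strictly positive and only shrinks the suboptimal mass over time. This mirrors, in a toy setting, the finite-time elimination under non-steep regularizers and the vanishing correction under steep ones.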

Problem

Research questions and friction points this paper is trying to address.

exploitability

FTRL dynamics

zero-sum games

strategic manipulation

regularizers

Innovation

Methods, ideas, or system contributions that make the work stand out.

exploitability

FTRL dynamics

steep regularizers