The Power of Perturbation under Sampling in Solving Extensive-Form Games

📅 2025-01-28

🤖 AI Summary

In imperfect-information extensive-form games, payoffs in large games must be estimated by sampling, and the optimistic approach to Follow-the-Regularized-Leader (FTRL) often becomes unstable in that setting. This paper studies perturbed FTRL as a remedy: (i) it develops a general framework for perturbed FTRL algorithms under sampling; (ii) it shows that perturbing the expected payoffs guarantees that the FTRL dynamics reach an approximate equilibrium, and that properly adjusting the perturbation magnitude yields last-iterate convergence to a Nash equilibrium; and (iii) it identifies a divergence function that reduces the variance of the perturbed payoff estimates. Experiments show that perturbed FTRL consistently outperforms non-perturbed FTRL in the last-iterate sense, and that the variance-reducing variant significantly outperforms prior algorithms on Leduc poker while converging smoothly on all benchmark games.

📝 Abstract

This paper investigates how perturbation does and does not improve the Follow-the-Regularized-Leader (FTRL) algorithm in imperfect-information extensive-form games. Perturbing the expected payoffs guarantees that the FTRL dynamics reach an approximate equilibrium, and proper adjustment of the magnitude of the perturbation leads to a Nash equilibrium (last-iterate convergence). This approach is robust even when payoffs are estimated using sampling, as is the case for large games, while the optimistic approach often becomes unstable. Building upon those insights, we first develop a general framework for perturbed FTRL algorithms under sampling. We then empirically show that, in the last-iterate sense, the perturbed FTRL consistently outperforms the non-perturbed FTRL. We further identify a divergence function that reduces the variance of the estimates for perturbed payoffs. With this divergence function, the perturbed FTRL significantly outperforms the prior algorithms on Leduc poker (whose structure is, in a sense, more asymmetric than that of the other benchmark games) and consistently exhibits smooth convergence behavior on all the benchmark games.
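To make the idea of perturbed FTRL under sampling concrete, here is a minimal sketch on a small zero-sum matrix game rather than an extensive-form game. It is not the paper's algorithm. Several choices are assumptions made for illustration: the payoff matrix, the entropy regularizer (so the FTRL step is a softmax), the squared Euclidean distance to a uniform anchor strategy as the divergence function, a fixed perturbation magnitude `mu`, and all function names. With `mu` held fixed, a scheme like this can only be expected to target an approximate equilibrium.

```python
import math
import random

# Row player's payoffs in a biased rock-paper-scissors game (zero-sum).
# The matrix is illustrative and not taken from the paper.
PAYOFF = [
    [0.0, -1.0, 3.0],
    [1.0, 0.0, -1.0],
    [-3.0, 1.0, 0.0],
]


def softmax(scores):
    """Numerically stable softmax: the FTRL solution under an entropy regularizer."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def exploitability(payoff, x, y):
    """Sum of both players' best-response gains against (x, y); zero at a Nash equilibrium."""
    n, m = len(payoff), len(payoff[0])
    row_values = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
    col_values = [sum(x[i] * payoff[i][j] for i in range(n)) for j in range(m)]
    return max(row_values) - min(col_values)


def perturbed_ftrl(payoff, steps=5000, eta=0.05, mu=0.1, seed=0):
    """Hypothetical perturbed FTRL with sampled payoffs; returns the last iterate."""
    rng = random.Random(seed)
    n, m = len(payoff), len(payoff[0])
    # Assumption: the anchor strategies are uniform and never updated.
    anchor_x, anchor_y = [1.0 / n] * n, [1.0 / m] * m
    cum_x, cum_y = [0.0] * n, [0.0] * m
    x, y = list(anchor_x), list(anchor_y)
    for _ in range(steps):
        # Sampling: each player observes payoffs against one sampled opponent action.
        a = rng.choices(range(n), weights=x)[0]
        b = rng.choices(range(m), weights=y)[0]
        for i in range(n):
            # Sampled payoff minus mu times the gradient of 0.5 * ||x - anchor||^2.
            cum_x[i] += payoff[i][b] - mu * (x[i] - anchor_x[i])
        for j in range(m):
            # The column player receives the negated payoff in a zero-sum game.
            cum_y[j] += -payoff[a][j] - mu * (y[j] - anchor_y[j])
        # FTRL step: maximize cumulative perturbed payoff plus entropy / eta.
        x = softmax([eta * c for c in cum_x])
        y = softmax([eta * c for c in cum_y])
    return x, y


if __name__ == "__main__":
    x, y = perturbed_ftrl(PAYOFF)
    print("last-iterate exploitability:", exploitability(PAYOFF, x, y))
```

Setting `mu=0.0` in this sketch recovers plain sampled FTRL, which gives a simple baseline for comparing last-iterate exploitability across seeds.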

Problem

Research questions and friction points this paper is trying to address.

FTRL Optimization

Complex Imperfect Information Games

Nash Equilibrium Strategies

Innovation

Methods, ideas, or system contributions that make the work stand out.

FTRL Algorithm

Perturbation Technique

Variance Reduction

🔎 Similar Papers

Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games