🤖 AI Summary
In extensive-form games with imperfect information, Nash equilibria (NE) may prescribe suboptimal strategies off the equilibrium path, failing to satisfy sequential rationality. Extensive-form perfect equilibrium (EFPE) addresses this by requiring robustness to vanishing perturbations (trembles), yet existing regret-minimization algorithms lack guaranteed convergence to EFPE.
Method: We propose the first adaptive regret-minimization algorithm that provably achieves last-iterate convergence to EFPE. It integrates Reward Transformation Counterfactual Regret Minimization (RTCFR) for solving perturbed games with Information Set Nash Equilibrium (ISNE)-guided dynamic perturbation scheduling, replacing conventional fixed-perturbation schemes and averaging-based strategy extraction.
Contribution/Results: We establish theoretical convergence to EFPE under standard assumptions. Empirically, our method outperforms state-of-the-art approaches on both NE- and EFPE-finding tasks across benchmark games, achieving higher accuracy and lower computational cost without sacrificing scalability.
📝 Abstract
The Nash Equilibrium (NE) assumes rational play in imperfect-information Extensive-Form Games (EFGs) but fails to ensure optimal strategies for off-equilibrium branches of the game tree, potentially leading to suboptimal outcomes in practical settings. To address this, the Extensive-Form Perfect Equilibrium (EFPE), a refinement of NE, introduces controlled perturbations to model potential player errors. However, existing EFPE-finding algorithms, which typically rely on average strategy convergence and fixed perturbations, face significant limitations: computing average strategies incurs high computational costs and approximation errors, while fixed perturbations create a trade-off between NE approximation accuracy and the convergence rate of NE refinements.
To tackle these challenges, we propose an efficient adaptive regret minimization algorithm for computing approximate EFPE, achieving last-iterate convergence in two-player zero-sum EFGs. Our approach introduces Reward Transformation Counterfactual Regret Minimization (RTCFR) to solve perturbed games and defines a novel metric, the Information Set Nash Equilibrium (ISNE), to dynamically adjust perturbations. Theoretical analysis confirms convergence to EFPE, and experimental results demonstrate that our method significantly outperforms state-of-the-art algorithms in both NE and EFPE-finding tasks.
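To make the perturbation idea concrete, the following is a minimal sketch (not the paper's algorithm) of regret matching restricted to an ε-perturbed simplex: every action keeps probability at least ε, which is how a fixed-perturbation scheme models trembles in a perturbed game. The payoff matrix, opponent strategy, and ε value are illustrative assumptions.

```python
import numpy as np

def regret_matching(cum_regret, eps):
    """Regret matching projected onto the eps-perturbed simplex:
    every action keeps probability at least eps, modeling trembles."""
    pos = np.maximum(cum_regret, 0.0)
    n = len(cum_regret)
    base = pos / pos.sum() if pos.sum() > 0 else np.full(n, 1.0 / n)
    # Mix with the uniform distribution so each action has prob >= eps
    return (1.0 - n * eps) * base + eps

# Toy 2x2 zero-sum game: the row player learns against a fixed opponent.
payoff = np.array([[1.0, -1.0], [-1.0, 1.0]])   # row player's payoffs
opp = np.array([0.7, 0.3])                      # fixed column strategy
eps = 0.05                                      # fixed tremble floor
cum_regret = np.zeros(2)
for t in range(1000):
    strat = regret_matching(cum_regret, eps)
    action_values = payoff @ opp                # value of each row action
    cum_regret += action_values - strat @ action_values
final = regret_matching(cum_regret, eps)
# final concentrates on the best response while every action stays >= eps
```

The fixed ε here illustrates the trade-off the abstract describes: a large ε keeps strategies far from an exact NE, while a small ε slows refinement. The proposed method instead adapts the perturbation dynamically via the ISNE metric and uses reward transformation to obtain last-iterate convergence, neither of which this sketch implements.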