Last-Iterate Convergence in Adaptive Regret Minimization for Approximate Extensive-Form Perfect Equilibrium

📅 2025-08-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In extensive-form games with imperfect information, Nash equilibria (NE) may prescribe suboptimal strategies off the equilibrium path, failing to satisfy sequential rationality. Extensive-form perfect equilibrium (EFPE) addresses this by requiring robustness to vanishing perturbations, yet existing regret-minimization algorithms lack guaranteed EFPE convergence. Method: We propose the first adaptive regret-minimization algorithm with provable last-iterate convergence to EFPE. It integrates Reward Transformation Counterfactual Regret Minimization (RTCFR) with dynamic perturbation scheduling guided by the Information Set Nash Equilibrium (ISNE) metric, replacing conventional fixed-perturbation schemes and averaging-based strategy extraction. Contribution/Results: We establish theoretical convergence to EFPE under standard assumptions. Empirically, our method outperforms state-of-the-art approaches on both NE and EFPE computation across benchmark games, achieving superior accuracy and computational efficiency without sacrificing scalability.

📝 Abstract
The Nash Equilibrium (NE) assumes rational play in imperfect-information Extensive-Form Games (EFGs) but fails to ensure optimal strategies for off-equilibrium branches of the game tree, potentially leading to suboptimal outcomes in practical settings. To address this, the Extensive-Form Perfect Equilibrium (EFPE), a refinement of NE, introduces controlled perturbations to model potential player errors. However, existing EFPE-finding algorithms, which typically rely on average strategy convergence and fixed perturbations, face significant limitations: computing average strategies incurs high computational costs and approximation errors, while fixed perturbations create a trade-off between NE approximation accuracy and the convergence rate of NE refinements. To tackle these challenges, we propose an efficient adaptive regret minimization algorithm for computing approximate EFPE, achieving last-iterate convergence in two-player zero-sum EFGs. Our approach introduces Reward Transformation Counterfactual Regret Minimization (RTCFR) to solve perturbed games and defines a novel metric, the Information Set Nash Equilibrium (ISNE), to dynamically adjust perturbations. Theoretical analysis confirms convergence to EFPE, and experimental results demonstrate that our method significantly outperforms state-of-the-art algorithms in both NE and EFPE-finding tasks.
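The perturbed-game idea in the abstract can be illustrated with a minimal sketch. Regret matching is the standard CFR building block, and the perturbation forces every action at an information set to be played with probability at least eps, which is how EFPE-style trembles are typically modeled. This is a generic illustration, not the paper's implementation; `regret_matching` and `perturbed_strategy` are names chosen here for clarity.

```python
import numpy as np

def regret_matching(cum_regret):
    """Regret matching: play actions proportional to positive cumulative regret."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    # No positive regret yet: fall back to the uniform strategy.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

def perturbed_strategy(sigma, eps):
    """Project a strategy onto the eps-perturbed simplex: every action
    receives probability at least eps, modeling a 'trembling hand'."""
    n = len(sigma)
    assert n * eps <= 1.0, "eps too large for this many actions"
    return eps + (1.0 - n * eps) * sigma
```

For cumulative regrets `[2, 0, -1]` and `eps = 0.05`, the perturbed strategy is `[0.90, 0.05, 0.05]`: the regret-maximizing action dominates, but no action's probability can vanish, so off-path behavior stays constrained.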
Problem

Research questions and friction points this paper is trying to address.

Ensuring optimal strategies on off-equilibrium branches of the game tree
Reducing the computational cost and approximation error of average-strategy-based equilibrium computation
Adjusting perturbations dynamically to balance convergence rate and approximation accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive regret minimization for EFPE
Reward Transformation CFR (RTCFR)
Dynamic perturbation via ISNE metric
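The third innovation, dynamic perturbation, can be sketched as a simple schedule: run regret-minimization updates on the eps-perturbed game and shrink eps only once the current solution is good enough. The `solve_step` and `gap_metric` callbacks below are illustrative placeholders (the gap stands in for the paper's ISNE metric), not the authors' actual interface.

```python
def adaptive_perturbation(solve_step, gap_metric, eps0=0.1, tol=1e-3,
                          min_eps=1e-6, num_iters=1000):
    """Adaptive perturbation schedule (sketch).

    solve_step(eps, profile) -> updated profile (one regret-minimization step)
    gap_metric(profile, eps) -> scalar stand-in for the ISNE distance
    Halve eps whenever the gap falls below tol, tightening the perturbed
    game toward EFPE instead of using a fixed perturbation throughout.
    """
    eps, profile = eps0, None
    for _ in range(num_iters):
        profile = solve_step(eps, profile)
        if gap_metric(profile, eps) < tol and eps > min_eps:
            eps *= 0.5  # tighten only once the current perturbed game is solved well
    return profile, eps
```

Compared with a fixed perturbation, this avoids the trade-off the abstract describes: a large fixed eps caps NE accuracy, while a tiny fixed eps slows convergence; shrinking eps on demand gets both.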
Hang Ren
School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China
Xiaozhen Sun
School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China
Tianzi Ma
School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China
Jiajia Zhang
Department of Epidemiology and Biostatistics, University of South Carolina
Survival Analysis, Mixture Cure Model, Spatial Survival, Frailty
Xuan Wang
School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies