Last-Iterate Convergence in Adaptive Regret Minimization for Approximate Extensive-Form Perfect Equilibrium

📅 2025-08-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In extensive-form games with imperfect information, Nash equilibria (NE) may prescribe suboptimal strategies off the equilibrium path, failing to satisfy sequential rationality. Extensive-form perfect equilibrium (EFPE) addresses this by requiring robustness to vanishing perturbations, yet existing regret-minimization algorithms lack guaranteed EFPE convergence. Method: We propose the first adaptive regret-minimization algorithm with provable last-iterate convergence to EFPE. It integrates Reward Transformation Counterfactual Regret Minimization (RTCFR) with dynamic perturbation scheduling guided by the Information Set Nash Equilibrium (ISNE) metric, replacing conventional fixed-perturbation schemes and averaging-based strategy extraction. Contribution/Results: We establish theoretical convergence to EFPE under standard assumptions. Empirically, our method outperforms state-of-the-art approaches on both NE and EFPE computation across benchmark games, achieving superior accuracy and computational efficiency without sacrificing scalability.

📝 Abstract
The Nash Equilibrium (NE) assumes rational play in imperfect-information Extensive-Form Games (EFGs) but fails to ensure optimal strategies for off-equilibrium branches of the game tree, potentially leading to suboptimal outcomes in practical settings. To address this, the Extensive-Form Perfect Equilibrium (EFPE), a refinement of NE, introduces controlled perturbations to model potential player errors. However, existing EFPE-finding algorithms, which typically rely on average strategy convergence and fixed perturbations, face significant limitations: computing average strategies incurs high computational costs and approximation errors, while fixed perturbations create a trade-off between NE approximation accuracy and the convergence rate of NE refinements. To tackle these challenges, we propose an efficient adaptive regret minimization algorithm for computing approximate EFPE, achieving last-iterate convergence in two-player zero-sum EFGs. Our approach introduces Reward Transformation Counterfactual Regret Minimization (RTCFR) to solve perturbed games and defines a novel metric, the Information Set Nash Equilibrium (ISNE), to dynamically adjust perturbations. Theoretical analysis confirms convergence to EFPE, and experimental results demonstrate that our method significantly outperforms state-of-the-art algorithms in both NE and EFPE-finding tasks.
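The perturbed-game idea in the abstract can be illustrated with a minimal sketch. Regret matching is the standard CFR building block, and the perturbation forces every action at an information set to be played with probability at least eps, which is how EFPE-style trembles are typically modeled. This is a generic illustration, not the paper's implementation; `regret_matching` and `perturbed_strategy` are names chosen here for clarity.

```python
import numpy as np

def regret_matching(cum_regret):
    """Regret matching: play actions proportional to positive cumulative regret."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    # No positive regret yet: fall back to the uniform strategy.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

def perturbed_strategy(sigma, eps):
    """Project a strategy onto the eps-perturbed simplex: every action
    receives probability at least eps, modeling a 'trembling hand'."""
    n = len(sigma)
    assert n * eps <= 1.0, "eps too large for this many actions"
    return eps + (1.0 - n * eps) * sigma
```

For cumulative regrets `[2, 0, -1]` and `eps = 0.05`, the perturbed strategy is `[0.90, 0.05, 0.05]`: the regret-maximizing action dominates, but no action's probability can vanish, so off-path behavior stays constrained.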
Problem

Research questions and friction points this paper is trying to address.

Ensuring optimal strategies on off-equilibrium branches of the game tree
Reducing the computational cost and approximation error of average-strategy-based equilibrium computation
Adjusting perturbations dynamically to balance convergence rate and approximation accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive regret minimization for EFPE
Reward Transformation CFR (RTCFR)
Dynamic perturbation via ISNE metric
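The third innovation, dynamic perturbation, can be sketched as a simple schedule: run regret-minimization updates on the eps-perturbed game and shrink eps only once the current solution is good enough. The `solve_step` and `gap_metric` callbacks below are illustrative placeholders (the gap stands in for the paper's ISNE metric), not the authors' actual interface.

```python
def adaptive_perturbation(solve_step, gap_metric, eps0=0.1, tol=1e-3,
                          min_eps=1e-6, num_iters=1000):
    """Adaptive perturbation schedule (sketch).

    solve_step(eps, profile) -> updated profile (one regret-minimization step)
    gap_metric(profile, eps) -> scalar stand-in for the ISNE distance
    Halve eps whenever the gap falls below tol, tightening the perturbed
    game toward EFPE instead of using a fixed perturbation throughout.
    """
    eps, profile = eps0, None
    for _ in range(num_iters):
        profile = solve_step(eps, profile)
        if gap_metric(profile, eps) < tol and eps > min_eps:
            eps *= 0.5  # tighten only once the current perturbed game is solved well
    return profile, eps
```

Compared with a fixed perturbation, this avoids the trade-off the abstract describes: a large fixed eps caps NE accuracy, while a tiny fixed eps slows convergence; shrinking eps on demand gets both.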
Hang Ren
School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China
Xiaozhen Sun
School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China
Tianzi Ma
School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China
Jiajia Zhang
Department of Epidemiology and Biostatistics, University of South Carolina
Survival Analysis, Mixture Cure Model, Spatial Survival, Frailty
Xuan Wang
School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies