Can Reinforcement Learning Solve Asymmetric Combinatorial-Continuous Zero-Sum Games?

📅 2025-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the existence and efficient computation of Nash equilibria (NE) in two-player Asymmetric Combinatorial-Continuous zEro-Sum (ACCES) games, where one player selects actions from an NP-hard combinatorial space while the other responds in a continuous, compact domain. The authors first formally define the ACCES game model and, using essentially finite game approximation, rigorously establish the existence of NE in this heterogeneous setting. To compute equilibria, they propose a dual-algorithm framework: CCDO (a deterministic method) and CCDORL (a reinforcement learning variant built on PPO or actor-critic architectures), both integrating a double-oracle mechanism, combinatorial optimization subroutines, and continuous gradient-based adversarial updates. Experiments show that the algorithms significantly outperform baselines across diverse NP-hard combinatorial optimization instances, and the implementation is publicly available.

📝 Abstract
There have been extensive studies on learning in zero-sum games, focusing on the analysis of the existence and algorithmic convergence of Nash equilibrium (NE). Existing studies mainly focus on symmetric games where the strategy spaces of the players are of the same type and size. For the few studies that do consider asymmetric games, they are mostly restricted to matrix games. In this paper, we define and study a new practical class of asymmetric games called two-player Asymmetric Combinatorial-Continuous zEro-Sum (ACCES) games, featuring a combinatorial action space for one player and an infinite compact space for the other. Such ACCES games have broad implications in the real world, particularly in combinatorial optimization problems (COPs) where one player optimizes a solution in a combinatorial space, and the opponent plays against it in an infinite (continuous) compact space (e.g., a nature player deciding epistemic parameters of the environmental model). Our first key contribution is to prove the existence of NE for two-player ACCES games, using the idea of essentially finite game approximation. Building on the theoretical insights and double oracle (DO)-based solutions to complex zero-sum games, our second contribution is to design the novel algorithm, Combinatorial Continuous DO (CCDO), to solve ACCES games, and prove the convergence of the proposed algorithm. Considering the NP-hardness of most COPs and recent advancements in reinforcement learning (RL)-based solutions to COPs, our third contribution is to propose a practical algorithm to solve NE in the real world, CCDORL (based on CCDO), and provide the novel convergence analysis in the ACCES game. Experimental results across diverse instances of COPs demonstrate the empirical effectiveness of our algorithms. The code of this work is available at https://github.com/wmd3i/CCDO-RL.
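The double-oracle loop at the heart of CCDO can be illustrated on a toy ACCES-style instance. The sketch below is purely hypothetical: the payoff function, the brute-force combinatorial oracle, the grid-search continuous oracle, and the fictitious-play subgame solver are stand-ins for the paper's actual components (RL-based best responses, gradient-based adversarial updates), chosen only to make the restrict-solve-expand pattern concrete. Here the combinatorial player picks a 2-element subset of items whose values depend on the opponent's continuous parameter theta in [0, 1].

```python
# Hypothetical double-oracle (DO) sketch in the spirit of CCDO, not the
# authors' implementation. Row player: combinatorial (maximizes); column
# player: continuous (minimizes). Both start from a restricted strategy
# set and alternate between solving the restricted zero-sum matrix game
# and adding best responses from their oracles.
from itertools import combinations

ITEMS = (0, 1, 2)

def payoff(subset, theta):
    """Row player's payoff; the column player receives the negation."""
    weights = (theta, 1.0 - theta, 0.5)  # toy item values depending on theta
    return sum(weights[i] for i in subset)

def solve_matrix_game(rows, cols, iters=5000):
    """Approximate NE of the restricted game via fictitious play."""
    m, n = len(rows), len(cols)
    A = [[payoff(r, c) for c in cols] for r in rows]
    rc, cc = [0] * m, [0] * n   # empirical play counts
    rc[0] = cc[0] = 1
    for _ in range(iters):
        # Each side best-responds to the opponent's empirical mixture.
        i = max(range(m), key=lambda i: sum(A[i][j] * cc[j] for j in range(n)))
        j = min(range(n), key=lambda j: sum(A[k][j] * rc[k] for k in range(m)))
        rc[i] += 1
        cc[j] += 1
    p = [c / sum(rc) for c in rc]
    q = [c / sum(cc) for c in cc]
    value = sum(p[i] * A[i][j] * q[j] for i in range(m) for j in range(n))
    return p, q, value

def double_oracle(eps=1e-3):
    rows = [(0, 1)]                        # initial combinatorial strategy
    cols = [0.0]                           # initial continuous strategy
    grid = [k / 200 for k in range(201)]   # continuous oracle: grid search
    all_subsets = list(combinations(ITEMS, 2))
    while True:
        p, q, value = solve_matrix_game(rows, cols)
        # Combinatorial oracle: brute-force best response to mixture q.
        br_row = max(all_subsets,
                     key=lambda s: sum(qj * payoff(s, c) for qj, c in zip(q, cols)))
        # Continuous oracle: grid-search best response to mixture p.
        br_col = min(grid,
                     key=lambda t: sum(pi * payoff(r, t) for pi, r in zip(p, rows)))
        gain_row = sum(qj * payoff(br_row, c) for qj, c in zip(q, cols)) - value
        gain_col = value - sum(pi * payoff(r, br_col) for pi, r in zip(p, rows))
        added = False
        if gain_row > eps and br_row not in rows:
            rows.append(br_row); added = True
        if gain_col > eps and br_col not in cols:
            cols.append(br_col); added = True
        if not added:          # no oracle found an improving strategy
            return value
```

In this toy game the subset {0, 1} always yields 1 and theta = 0.5 equalizes all subsets, so the loop converges to a game value of approximately 1. CCDORL replaces the two brute-force oracles with RL-trained approximate best responses, which is what makes the scheme practical when the combinatorial space is NP-hard.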
Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning
Asymmetric Zero-Sum Games
Nash Equilibrium
Innovation

Methods, ideas, or system contributions that make the work stand out.

ACCES Game
CCDORL Algorithm
Nash Equilibrium
Yuheng Li
Department of Data Science, College of William & Mary
Panpan Wang
Department of Data Science, College of William & Mary
Haipeng Chen
Assistant professor of data science, William & Mary
Reinforcement learning · Generative AI · Health · AI for social good