🤖 AI Summary
Problem: Existing combinations of Proof-Number Search (PNS) and Monte-Carlo Tree Search (MCTS) rely on proof and disproof numbers, which ties them to two-player games and to proving only wins or not-wins.
Method: We propose Generalized Proof-Number Monte-Carlo Tree Search (GPNS-MCTS), which tracks one proof number per player instead of proof and disproof numbers, making the technique applicable to games with more than two players. GPNS-MCTS uses these proof numbers to bias UCB1-based selection and is merged with Score Bounded MCTS, so that it can prove and exploit upper and lower bounds on scores rather than only wins or not-wins.
Contribution/Results: Across the 11 tested board games, GPNS-MCTS yields substantial performance increases, reaching the range of 80% in 8 of them. It achieves this with selection strategies that are simpler to implement and compute.
📝 Abstract
This paper presents Generalized Proof-Number Monte-Carlo Tree Search: a generalization of recently proposed combinations of Proof-Number Search (PNS) with Monte-Carlo Tree Search (MCTS), which use (dis)proof numbers to bias UCB1-based Selection strategies towards parts of the search that are expected to be easily (dis)proven. We propose three core modifications of prior combinations of PNS with MCTS. First, we track proof numbers per player. This reduces code complexity in the sense that we no longer need disproof numbers, and generalizes the technique to be applicable to games with more than two players. Second, we propose and extensively evaluate different methods of using proof numbers to bias the selection strategy, achieving strong performance with strategies that are simpler to implement and compute. Third, we merge our technique with Score Bounded MCTS, enabling the algorithm to prove and leverage upper and lower bounds on scores, as opposed to only proving wins or not-wins. Experiments demonstrate substantial performance increases, reaching the range of 80% for 8 out of the 11 tested board games.
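The abstract does not state the update rules, so the following Python sketch is only one plausible reading of "proof numbers per player" and of biasing UCB1 with them. It assumes the natural extension of the classical PNS rules: for player p, a node where p moves takes the minimum of its children's proof numbers, and a node where another player moves takes their sum. The `Node` class, the rank-based bonus, and the constants `c_ucb` and `c_pn` are hypothetical names and choices made for illustration, not the paper's definitions.

```python
import math
from dataclasses import dataclass, field

INF = math.inf


@dataclass
class Node:
    mover: int                      # index of the player to move at this node
    num_players: int
    children: list = field(default_factory=list)
    winner: int = -1                # winning player on terminal nodes, else -1
    terminal: bool = False
    visits: int = 0
    total_reward: float = 0.0       # summed reward from the parent mover's view
    pn: list = field(default_factory=list)  # one proof number per player

    def __post_init__(self):
        if not self.pn:
            # Unexpanded leaf: one node still has to be proven for each player.
            self.pn = [1] * self.num_players


def update_proof_numbers(node):
    """Recompute node.pn from the terminal result or from the children."""
    for p in range(node.num_players):
        if node.terminal:
            node.pn[p] = 0 if node.winner == p else INF
        elif node.children:
            child_pns = [c.pn[p] for c in node.children]
            # Assumed rule: p needs one winning move at its own nodes, but
            # must win against every reply where another player moves.
            node.pn[p] = min(child_pns) if node.mover == p else sum(child_pns)


def select_child(node, c_ucb=1.4, c_pn=0.5):
    """UCB1 plus a hypothetical rank-based bonus for small proof numbers."""
    p = node.mover
    ranked = sorted(node.children, key=lambda c: c.pn[p])
    rank = {id(c): i for i, c in enumerate(ranked)}
    n = len(node.children)

    def score(c):
        if c.visits == 0:
            return INF              # try unvisited children first
        exploit = c.total_reward / c.visits
        explore = c_ucb * math.sqrt(math.log(max(node.visits, 1)) / c.visits)
        bonus = c_pn * (1 - rank[id(c)] / n)
        return exploit + explore + bonus

    return max(node.children, key=score)


def leaf():
    return Node(mover=2, num_players=3)


# Tiny three-player tree: player 0 moves at the root, player 1 below it.
a = Node(mover=1, num_players=3, children=[leaf(), leaf()],
         visits=10, total_reward=5.0)
won = Node(mover=2, num_players=3, terminal=True, winner=0)
b = Node(mover=1, num_players=3, children=[won, leaf()],
         visits=10, total_reward=5.0)
root = Node(mover=0, num_players=3, children=[a, b], visits=20)

for n in (won, a, b, root):         # update children before parents
    update_proof_numbers(n)

print(root.pn, select_child(root) is b)
```

In this toy tree both children of the root have identical visit counts and rewards, so the proof-number bonus alone decides the selection in favour of `b`, the child that is closer to a proven win for player 0. Because every player has its own entry in `pn`, no separate disproof number is stored, which is the simplification the abstract describes.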