Monte Carlo Permutation Search

📅 2025-10-07
🤖 AI Summary
Problem: When deep reinforcement learning is not an option, or when little computing power is available before play, general-purpose Monte Carlo Tree Search (MCTS) variants such as GRAVE depend on a manually tuned bias hyperparameter, which limits their robustness across games. Method: The paper proposes Monte Carlo Permutation Search (MCPS), an MCTS algorithm that adds to a node's exploration term the statistics of all playouts containing every move on the path from the root to that node. MCPS weights three sources of statistics, including AMAF (All Moves As First) and permutation statistics, with derived formulas that drop GRAVE's bias hyperparameter, and it can use abstract move codes to improve both the AMAF and the permutation statistics. Results: MCPS obtains better results than GRAVE in all the two-player games tested, equivalent results in multi-player games, and is not sensitive to the ref hyperparameter.

📝 Abstract
We propose Monte Carlo Permutation Search (MCPS), a general-purpose Monte Carlo Tree Search (MCTS) algorithm that improves upon the GRAVE algorithm. MCPS is relevant when deep reinforcement learning is not an option, or when the computing power available before play is not substantial, as in General Game Playing. The principle of MCPS is to include in the exploration term of a node the statistics on all the playouts that contain all the moves on the path from the root to the node. We extensively test MCPS on a variety of games: board games, a wargame, an investment game, a video game, and multi-player games. MCPS has better results than GRAVE in all the two-player games. It has equivalent results for multi-player games because these games are inherently balanced even when players have different strengths. We also show that using abstract codes for moves instead of exact codes can be beneficial to both MCPS and GRAVE, as they improve the permutation statistics and the AMAF statistics. We also provide a mathematical derivation of the formulas used for weighting the three sources of statistics. These formulas improve on the GRAVE formula since they no longer use GRAVE's bias hyperparameter. Moreover, MCPS is not sensitive to the ref hyperparameter.
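The principle stated in the abstract (collecting statistics over all playouts that contain every move on the path from the root to a node) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Playout` record, the `permutation_stats` function, and the choice to treat a playout as a set of move codes (ignoring repeated moves) are assumptions made for illustration. Move codes are assumed to already identify the player who made the move.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Playout:
    """One finished playout (hypothetical record type)."""
    moves: tuple   # move codes played during the playout, in order
    result: float  # outcome for the player of interest, e.g. 1.0 = win, 0.0 = loss


def permutation_stats(playouts, path):
    """Return (count, mean result) over the playouts that contain
    every move on `path`, regardless of the order they were played in."""
    needed = set(path)
    matching = [p.result for p in playouts if needed.issubset(p.moves)]
    n = len(matching)
    mean = sum(matching) / n if n else 0.0
    return n, mean


playouts = [
    Playout(moves=("a", "b", "c", "d"), result=1.0),
    Playout(moves=("b", "a", "e"), result=0.0),  # same moves as the path, other order
    Playout(moves=("a", "c", "f"), result=1.0),  # lacks "b", so it is not counted
]

n, mean = permutation_stats(playouts, path=("a", "b"))
print(n, mean)  # 2 0.5
```

The second playout reaches the moves `a` and `b` in a different order than the path does, yet it still contributes. This is what distinguishes these statistics from the node's own statistics, which only count playouts that went through the node itself.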
Problem

Research questions and friction points this paper is trying to address.

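GRAVE's weighting formula relies on a bias hyperparameter that must be tuned by hand
Deep reinforcement learning is not always an option, for example when little computing power is available before play
GRAVE's exploration term does not use the playouts that contain all the moves on the path from the root to the node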
Innovation

Methods, ideas, or system contributions that make the work stand out.

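MCPS, a general-purpose MCTS algorithm that improves upon GRAVE
Adds to a node's exploration term the statistics of all playouts that contain all the moves on the path from the root to the node
Derives weighting formulas for three sources of statistics that remove GRAVE's bias hyperparameter, and is not sensitive to the ref hyperparameter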
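
For context, the bias hyperparameter enters GRAVE through the weight that mixes a move's own mean with its AMAF mean. The sketch below follows the weighting as it is usually written in the GRAVE literature. It is not taken from this page, the function and parameter names are hypothetical, and the MCPS formulas are not reproduced because the page does not give them.

```python
def grave_beta(n, n_amaf, bias):
    """Weight given to the AMAF mean.

    n      -- number of playouts through the move itself
    n_amaf -- number of playouts counted in the move's AMAF statistics
    bias   -- the hand-tuned hyperparameter that MCPS is reported to remove
    """
    if n_amaf == 0:
        return 0.0  # no AMAF information yet: rely on the move's own mean
    return n_amaf / (n_amaf + n + bias * n_amaf * n)


def grave_value(mean, n, amaf_mean, n_amaf, bias):
    """Blend the move's own mean with its AMAF mean using grave_beta."""
    beta = grave_beta(n, n_amaf, bias)
    return (1.0 - beta) * mean + beta * amaf_mean
```

With `bias = 0` the weight depends only on the two playout counts. Larger values shift weight toward the move's own mean more quickly as playouts accumulate, which is why the result depends on how this value is tuned.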