🤖 AI Summary
Chinese chess (Xiangqi) poses unique modeling challenges for deep reinforcement learning (DRL) due to its high branching factor, asymmetric piece dynamics, and culturally specific rules, including the "general-facing" prohibition, river-bound movement constraints, and the lateral mobility soldiers gain upon crossing the river.
Method: We propose a DRL–Monte Carlo Tree Search (MCTS) co-training paradigm tailored to cultural strategic games. Our approach employs a deep residual convolutional network with a shared policy-value head and integrates domain-informed action pruning and win-rate-guided backpropagation into MCTS.
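The MCTS modifications described above can be illustrated with a minimal sketch, assuming a generic game interface. All class and function names here are our own for illustration, not from the paper; the pruning rule keeps the top-k actions by network prior as a stand-in for the paper's domain-informed (rule-aware) pruning:

```python
import math

class Node:
    """One MCTS tree node holding visit statistics for a game state."""
    def __init__(self, parent=None, prior=1.0):
        self.parent = parent
        self.children = {}       # action -> Node
        self.visits = 0
        self.value_sum = 0.0     # accumulated win-rate estimates
        self.prior = prior       # move probability from the policy head

    def q(self):
        """Mean estimated win rate of this node."""
        return self.value_sum / self.visits if self.visits else 0.0

def ucb(node, child, c_puct=1.5):
    # PUCT-style selection: exploit mean win rate, explore by prior
    # and visit count (the +1 keeps priors relevant at zero visits).
    u = c_puct * child.prior * math.sqrt(node.visits + 1) / (1 + child.visits)
    return child.q() + u

def prune_actions(actions, priors, keep=3):
    # Domain-informed pruning placeholder: keep the top-k actions by
    # prior. A real Xiangqi system would also drop rule-illegal moves,
    # e.g. those violating the general-facing prohibition.
    return sorted(zip(actions, priors), key=lambda ap: -ap[1])[:keep]

def backpropagate(node, win_rate):
    # Win-rate-guided backpropagation: push the leaf's estimated win
    # rate up the path, flipping perspective at each ply.
    while node is not None:
        node.visits += 1
        node.value_sum += win_rate
        win_rate = 1.0 - win_rate
        node = node.parent

# Minimal demo: expand a root with pruned actions, select, backpropagate.
root = Node()
actions = list(range(10))
priors = [0.05, 0.2, 0.1, 0.15, 0.05, 0.05, 0.1, 0.1, 0.1, 0.1]
for a, p in prune_actions(actions, priors, keep=3):
    root.children[a] = Node(parent=root, prior=p)
leaf = max(root.children.values(), key=lambda ch: ucb(root, ch))
backpropagate(leaf, win_rate=0.9)
```

After one backup the selected child records a 0.9 win rate while the root, seen from the opponent's perspective, records 0.1, which is the perspective flip the sketch is meant to show.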
Contribution/Results: This work achieves the first end-to-end neural modeling of the full Chinese chess rule set. Empirical evaluation shows the agent attains professional (shodan-equivalent) performance under standard rules, with a self-play win rate above 92%. When transferred to rule variants, it achieves 40% higher sample efficiency, indicating substantially better generalization across variants.
📝 Abstract
This paper presents a Deep Reinforcement Learning (DRL) system for Xiangqi (Chinese Chess) that integrates neural networks with Monte Carlo Tree Search (MCTS) to enable strategic self-play and self-improvement. Xiangqi's complexity remains underexplored: its distinctive board layout, piece-movement constraints, and victory conditions all complicate standard game-playing pipelines. Our approach couples a policy-value network with MCTS to simulate move consequences and refine decision-making. By addressing challenges such as Xiangqi's high branching factor and asymmetric piece dynamics, this work advances AI capabilities in culturally significant strategy games while providing insights for adapting DRL-MCTS frameworks to domain-specific rule systems.
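The shared policy-value head mentioned above can be sketched in a few lines, assuming NumPy. A tiny MLP stands in for the paper's deep residual CNN, and all sizes (board encoding, hidden width, action space) are illustrative placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
BOARD_FEATS = 90   # 9x10 Xiangqi board, flattened (illustrative encoding)
HIDDEN = 64        # shared-trunk width (placeholder)
N_ACTIONS = 32     # placeholder action-space size

# Shared trunk feeds two heads: move probabilities and a win-rate value.
W_trunk = rng.normal(0, 0.1, (BOARD_FEATS, HIDDEN))
W_policy = rng.normal(0, 0.1, (HIDDEN, N_ACTIONS))
W_value = rng.normal(0, 0.1, HIDDEN)

def forward(x):
    h = np.tanh(x @ W_trunk)            # shared board representation
    logits = h @ W_policy
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()              # softmax over moves
    value = float(np.tanh(h @ W_value)) # position evaluation in [-1, 1]
    return policy, value

policy, value = forward(rng.normal(size=BOARD_FEATS))
```

Sharing the trunk is the standard design choice in policy-value architectures: both heads see the same board features, so the value estimate used by MCTS backups and the priors used for move selection stay consistent.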