🤖 AI Summary
This work addresses pathological behaviors, such as cycling, that arise when coevolutionary algorithms (CoEAs) are applied to impartial combinatorial games, where intransitive payoffs impede convergence. It gives the first runtime analysis of CoEAs on such games: a general upper bound on the number of simulated games that UMDA (a type of CoEA) needs to discover an optimal strategy with high probability. The analysis combines modeling of UMDA population dynamics, probabilistic arguments, and structural properties of game graphs. The bound applies to any impartial combinatorial game, and for many games it is polynomial or quasipolynomial in the number of game positions. The result is then applied to several classical impartial games: Nim, Chomp, Silver Dollar, and Turning Turtles. The core contributions are: (i) the first provable runtime upper bound for CoEAs on combinatorial games, and (ii) an analytical approach intended as a step towards a broader theoretical framework for coevolution.
📝 Abstract
Due to their complex dynamics, combinatorial games are a key test case and application for algorithms that train game-playing agents. Among those algorithms that train using self-play are coevolutionary algorithms (CoEAs). CoEAs evolve a population of individuals by iteratively selecting the strongest based on their interactions against contemporaries, and using those selected as parents for the following generation (via randomised mutation and crossover). However, the successful application of CoEAs to game playing is difficult due to pathological behaviours such as cycling, an issue especially critical for games with intransitive payoff landscapes. Insight into how to design CoEAs to avoid such behaviours can be provided by runtime analysis. In this paper, we push the scope of runtime analysis to combinatorial games, proving a general upper bound for the number of simulated games needed for UMDA (a type of CoEA) to discover (with high probability) an optimal strategy for an impartial combinatorial game. This result applies to any impartial combinatorial game, and for many games the implied bound is polynomial or quasipolynomial as a function of the number of game positions. After proving the main result, we provide several applications to simple well-known games: Nim, Chomp, Silver Dollar, and Turning Turtles. As the first runtime analysis for CoEAs on combinatorial games, this result is a critical step towards a comprehensive theoretical framework for coevolution.
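To make the setting concrete, the following is a minimal sketch of a UMDA-style self-play loop on a small Nim instance. It is an illustration under stated assumptions, not the paper's algorithm or encoding: the starting heaps `START`, the bitstring encoding (one bit per position, read as "move here if possible"), the fitness rule (wins as first and as second player against one random contemporary), and the parameters `lam`, `mu`, and `generations` are all hypothetical choices. The only facts relied on are the rules of normal-play Nim, the classical result that a Nim position is lost for the player to move exactly when the XOR of the heap sizes is zero, and the usual UMDA update (sample from a product distribution, select the best, re-estimate the marginals within margins).

```python
import itertools
import random
from functools import reduce
from operator import xor

START = (3, 4, 5)  # hypothetical starting heaps, chosen only for illustration
POSITIONS = list(itertools.product(*(range(h + 1) for h in START)))
INDEX = {pos: i for i, pos in enumerate(POSITIONS)}


def successors(pos):
    """All positions reachable in one Nim move (remove >= 1 stone from one heap)."""
    return [pos[:i] + (k,) + pos[i + 1:] for i, h in enumerate(pos) for k in range(h)]


def choose_move(strategy, pos):
    """Assumed encoding: move to the first successor whose bit is 1, else the first legal move."""
    options = successors(pos)
    for nxt in options:
        if strategy[INDEX[nxt]]:
            return nxt
    return options[0]


def first_player_wins(first, second, start=START):
    """Simulate one game under normal play: the player who cannot move loses."""
    players = (first, second)
    pos, turn = start, 0
    while any(pos):
        pos = choose_move(players[turn], pos)
        turn ^= 1
    # The player to move at the empty position has lost, so the other player won.
    return turn == 1


def optimal_strategy():
    """Mark exactly the positions with nim-sum zero (lost for the player to move)."""
    return [int(reduce(xor, pos) == 0) for pos in POSITIONS]


def umda(lam=60, mu=15, generations=40, seed=0):
    """UMDA-style loop with self-play fitness; returns the final marginal probabilities."""
    rng = random.Random(seed)
    n = len(POSITIONS)
    lo, hi = 1 / n, 1 - 1 / n  # margins keep every bit value reachable
    p = [0.5] * n
    for _ in range(generations):
        # Sample lam strategies from the product distribution given by p.
        pop = [[int(rng.random() < q) for q in p] for _ in range(lam)]
        scored = []
        for x in pop:
            y = rng.choice(pop)  # one random contemporary as opponent
            wins = int(first_player_wins(x, y)) + int(not first_player_wins(y, x))
            scored.append((wins, rng.random(), x))  # random value breaks ties
        scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
        parents = [x for _, _, x in scored[:mu]]
        # Re-estimate each marginal from the selected parents, clamped to the margins.
        p = [min(hi, max(lo, sum(x[i] for x in parents) / mu)) for i in range(n)]
    return p
```

Calling `umda()` returns one probability per game position, and each generation costs `2 * lam` simulated games, which is the quantity a runtime bound of this kind counts. Whether this particular sketch converges to `optimal_strategy()` is not claimed here; it only shows the moving parts (sampling, self-play evaluation, selection, marginal update) that such an analysis has to account for.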