🤖 AI Summary
This study investigates the capacity of large language models (LLMs) to master multiple complex card games with heterogeneous rules. Method: We construct high-quality gameplay supervision data, employ multi-task supervised fine-tuning, and mix in general instruction-tuning data; we systematically evaluate model performance across eight structurally diverse card games. Results: (1) LLMs can concurrently acquire proficiency in multiple complex card games; (2) rule similarity significantly enhances cross-game generalization, whereas rule conflicts degrade performance; (3) incorporating general instruction data effectively mitigates the degradation of general capabilities induced by multi-task training. Our approach enables models to approach expert-level game AI performance, demonstrating strong learning capability, robust cross-game transferability, and synergistic multi-task gains. This work establishes a paradigm for generalizable structured strategic reasoning in LLMs, advancing their applicability to rule-governed, decision-intensive domains.
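The training recipe above boils down to mixing per-game supervision with a slice of general instruction data before fine-tuning. A minimal sketch of that mixing step is shown below; the function name, the `general_ratio` parameter, and the example record format are illustrative assumptions, not details from the paper.

```python
import random

def build_sft_mixture(game_datasets, general_data, general_ratio=0.2, seed=0):
    """Combine per-game gameplay supervision with general instruction data.

    game_datasets: dict mapping game name -> list of {"prompt", "response"} examples.
    general_ratio: target fraction of the final mixture drawn from general
        instruction data (hypothetical knob; the paper's exact ratio may differ).
    """
    # Pool supervision from all games for multi-task fine-tuning.
    game_examples = [ex for examples in game_datasets.values() for ex in examples]
    # Size the general slice so it makes up general_ratio of the final mixture.
    n_general = int(len(game_examples) * general_ratio / (1 - general_ratio))
    rng = random.Random(seed)
    mixture = game_examples + rng.sample(general_data, min(n_general, len(general_data)))
    rng.shuffle(mixture)  # interleave games and general data for training
    return mixture
```

In practice the resulting list would be fed to a standard SFT pipeline; the key design choice reflected here is that the general instruction slice is sized relative to the gameplay data, so the mitigation of general-capability loss scales with the amount of game supervision.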
📝 Abstract
Complex games have long been an important benchmark for measuring the progress of artificial intelligence algorithms. AlphaGo, AlphaZero, and MuZero have defeated top human players in Go and Chess, drawing widespread public attention to artificial intelligence. Concurrently, large language models (LLMs) have exhibited remarkable capabilities across various tasks, raising the question of whether LLMs can achieve similar success in complex games. In this paper, we explore the potential of LLMs in mastering complex card games. We systematically assess the learning capabilities of LLMs across eight diverse card games, evaluating the impact of fine-tuning on high-quality gameplay data and examining the models' ability to retain general capabilities while mastering these games. Our findings indicate that: (1) LLMs can approach the performance of strong game AIs through supervised fine-tuning on high-quality data; (2) LLMs can master multiple complex card games simultaneously, with similar rules reinforcing performance across games and dissimilar rules causing interference; and (3) LLMs experience a decline in general capabilities when mastering complex games, but this decline can be mitigated by integrating a certain amount of general instruction data. The evaluation results demonstrate the strong learning ability and versatility of LLMs.