OpenGuanDan: A Large-Scale Imperfect Information Game Benchmark

📅 2026-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing large-scale multi-agent benchmarks, which often lack support for complex mixed cooperative-competitive scenarios involving imperfect information, dynamic team formation, and long-horizon decision-making. To bridge this gap, the paper introduces OpenGuanDan, a standardized and open benchmark platform built on GuanDan, a Chinese four-player card game. The proposed framework features an independent player API architecture that supports efficient simulation, human-AI interaction, and evaluation of diverse agent types, including reinforcement learning agents, rule-based systems, and large language models. Experimental results demonstrate that current learning-based agents significantly outperform rule-based counterparts but have not yet achieved superhuman performance, validating the benchmark's effectiveness and its capacity to pose meaningful challenges for advancing multi-agent decision-making research.

📝 Abstract
The advancement of data-driven artificial intelligence (AI), particularly machine learning, heavily depends on large-scale benchmarks. Despite remarkable progress across domains ranging from pattern recognition to intelligent decision-making in recent decades, exemplified by breakthroughs in board games, card games, and electronic sports games, there remains a pressing need for more challenging benchmarks to drive further research. To this end, this paper proposes OpenGuanDan, a novel benchmark that enables both efficient simulation of GuanDan (a popular four-player, multi-round Chinese card game) and comprehensive evaluation of both learning-based and rule-based GuanDan AI agents. OpenGuanDan poses a suite of nontrivial challenges, including imperfect information, large information sets and action spaces, a mixed learning objective involving cooperation and competition, long-horizon decision-making, variable action spaces, and dynamic team composition. These characteristics make it a demanding testbed for existing intelligent decision-making methods. Moreover, the independent API for each player allows human-AI interaction and supports integration with large language models. Empirically, we conduct two types of evaluations: (1) pairwise competitions among all GuanDan AI agents, and (2) human-AI matchups. Experimental results demonstrate that while current learning-based agents substantially outperform rule-based counterparts, they still fall short of achieving superhuman performance, underscoring the need for continued research in the domain of multi-agent intelligent decision-making. The project is publicly available at https://github.com/GameAI-NJUPT/OpenGuanDan.
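The abstract highlights that each player is driven through an independent API, which is what allows rule-based agents, learned policies, humans, and LLM wrappers to be mixed in one match. A minimal sketch of what such a per-player interface could look like is below; all names here (`RandomAgent`, `act`, `legal_actions`) are illustrative assumptions for exposition, not the actual OpenGuanDan API.

```python
import random

class RandomAgent:
    """Hypothetical baseline agent: picks uniformly among legal actions.

    In an imperfect-information game like GuanDan, the observation passed
    to each agent contains only that player's view (own hand, public play
    history), never the opponents' hidden cards.
    """

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def act(self, observation, legal_actions):
        # The set of legal actions varies from turn to turn (variable
        # action space), so the simulator supplies it each call.
        return self._rng.choice(legal_actions)

# Toy driver: four independent agents, each queried only through its own
# API call, as a stand-in for one simulation step of a four-player game.
agents = [RandomAgent(seed=i) for i in range(4)]
actions = [
    agent.act(observation={"hand": [], "history": []},
              legal_actions=["pass", "play"])
    for agent in agents
]
```

Because an agent only ever sees `(observation, legal_actions)` and returns one action, the same loop can drive a human console client or an LLM prompt wrapper without changing the simulator side.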
Problem

Research questions and friction points this paper is trying to address.

imperfect information
multi-agent decision-making
large-scale benchmark
cooperation and competition
long-horizon decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

imperfect information game
multi-agent decision-making
cooperative-competitive AI
large-scale benchmark
human-AI interaction