Reinforcing Competitive Multi-Agents for Playing So Long Sucker

📅 2024-11-17
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the highly complex multi-agent social game *So Long Sucker* (SLS), which is characterized by dynamic coalition formation, strategic deception, and player elimination, all of which pose significant challenges for multi-agent reinforcement learning (MARL). Method: We introduce the first open-source, reproducible SLS environment, including a graphical user interface and a deep reinforcement learning (DRL) benchmarking toolkit, and systematically evaluate DQN, DDQN, and Dueling DQN. To improve action feasibility, we propose rule-compliance constraints and reward shaping. Contribution/Results: Our approach achieves a >95% preference for legal actions and enables agents to converge to ~50% of the theoretical maximum reward; however, convergence requires ~2,000 episodes, and occasional illegal actions persist. This establishes the first DRL baseline for SLS, revealing fundamental training-efficiency bottlenecks and demonstrating both the feasibility and the structural limitations of classical MARL methods in dynamic coalition games.
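The rule-compliance constraints and reward shaping mentioned in the summary can be sketched as action masking plus an illegal-move penalty. The helper names, mask representation, and penalty value below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def masked_greedy_action(q_values, legal_mask):
    """Greedy action selection restricted to legal moves: illegal
    actions have their Q-values masked to -inf before the argmax
    (hypothetical helper; the paper's exact mechanism may differ)."""
    q = np.where(legal_mask, q_values, -np.inf)
    return int(np.argmax(q))

def shaped_reward(base_reward, action_was_legal, penalty=-1.0):
    """Reward shaping: apply a fixed penalty (illustrative value)
    whenever the chosen action violates the game rules."""
    return base_reward if action_was_legal else base_reward + penalty

q = np.array([0.2, 1.5, -0.3, 0.9])
mask = np.array([True, False, True, True])  # action 1 is illegal here
a = masked_greedy_action(q, mask)           # picks action 3, not 1
```

Masking enforces legality at selection time, while the penalty term pushes the learned Q-values themselves away from illegal moves, which is consistent with the reported >95% preference for legal actions.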

📝 Abstract
This paper examines the use of classical deep reinforcement learning (DRL) algorithms, DQN, DDQN, and Dueling DQN, in the strategy game So Long Sucker (SLS), a diplomacy-driven game defined by coalition-building and strategic betrayal. SLS poses unique challenges due to its blend of cooperative and adversarial dynamics, making it an ideal platform for studying multi-agent learning and game theory. The study's primary goal is to teach autonomous agents the game's rules and strategies using classical DRL methods. To support this effort, the authors developed a novel, publicly available implementation of SLS, featuring a graphical user interface (GUI) and benchmarking tools for DRL algorithms. Experimental results reveal that, although these algorithms are considered basic by modern DRL standards, DQN, DDQN, and Dueling DQN agents achieved roughly 50% of the maximum possible game reward. This suggests a baseline understanding of the game's mechanics, with agents favoring legal moves over illegal ones. However, a significant limitation was the extensive training required, around 2,000 games, for agents to reach peak performance, compared to human players, who grasp the game within a few rounds. Even after prolonged training, agents occasionally made illegal moves, highlighting both the potential and the limitations of these classical DRL methods in semi-complex, socially driven games. The findings establish a foundational benchmark for training agents in SLS and similar negotiation-based environments while underscoring the need for advanced or hybrid DRL approaches to improve learning efficiency and adaptability. Future research could incorporate game-theoretic strategies to enhance agent decision-making in dynamic multi-agent contexts.
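Of the three algorithms the abstract names, DDQN differs from vanilla DQN only in how the bootstrap target is formed: the online network selects the next action and the target network evaluates it, reducing overestimation bias. A minimal sketch of that target computation, with hypothetical variable names:

```python
import numpy as np

def ddqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN bootstrap target:
    y = r + gamma * Q_target(s', argmax_a Q_online(s', a)),
    with no bootstrap on terminal transitions."""
    a_star = int(np.argmax(q_online_next))        # online net selects
    bootstrap = 0.0 if done else gamma * q_target_next[a_star]  # target net evaluates
    return reward + bootstrap

t = ddqn_target(1.0, 0.99,
                q_online_next=np.array([0.1, 0.8]),
                q_target_next=np.array([0.5, 0.4]),
                done=False)
# online net prefers action 1, so the target net's value 0.4 is used
```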
Problem

Research questions and friction points this paper is trying to address.

Developing MARL benchmark with coalition formation and deception
Creating computational framework for strategic multi-agent gameplay
Addressing limitations of classical RL in complex negotiation dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed computational framework with GUI for SLS
Applied classical DRL algorithms (DQN, DDQN, Dueling DQN)
Established negotiation-aware benchmark for multi-agent training
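The Dueling DQN variant listed above decomposes Q-values into a state value and per-action advantages, recombined with a mean-subtracted aggregation. A minimal numpy sketch of that aggregation step (the surrounding network layers are omitted; names are illustrative):

```python
import numpy as np

def dueling_q(value, advantages):
    """Standard Dueling DQN aggregation:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage makes the V/A split identifiable."""
    return value + advantages - advantages.mean()

adv = np.array([1.0, 3.0, 2.0])
q = dueling_q(0.5, adv)  # mean advantage is 2.0, so it is shifted out
```

The mean subtraction matters because V(s) and A(s, a) are otherwise only determined up to an additive constant; without it, the two heads cannot be learned stably.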
Medant Sharan
Dept. of Informatics, King's College London, WC2R 2LS United Kingdom
Chandranath Adak
Indian Institute of Technology Patna
Computer Vision · Deep Learning · Biometrics · Data Analytics