Learning Strategy Representation for Imitation Learning in Multi-Agent Games

📅 2024-09-28
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
In multi-agent imitation learning, offline datasets often contain heterogeneous policies, while existing methods rely on player identity annotations or strong prior assumptions. To address these limitations, this paper proposes STRIL—a framework that learns strategy-aware trajectory representations via contrastive learning and latent-space clustering, without requiring player identifiers or restrictive assumptions. STRIL introduces interpretable metrics—consistency and dominance—to quantitatively evaluate strategy quality, and integrates an end-to-end dynamic reweighting and pruning mechanism to identify and select dominant demonstration trajectories. The framework is plug-and-play, compatible with standard imitation learning algorithms such as Behavior Cloning (BC) and Generative Adversarial Imitation Learning (GAIL). Empirical evaluation on two-player Pong, Limit Texas Hold’em, and Connect Four demonstrates substantial performance gains over baselines; trajectory representation separability improves by over 32%, and dominant strategy trajectories are accurately identified.

📝 Abstract
The offline datasets for imitation learning (IL) in multi-agent games typically contain player trajectories exhibiting diverse strategies, which necessitate measures to prevent learning algorithms from acquiring undesirable behaviors. Learning representations for these trajectories is an effective approach to depicting the strategies employed by each demonstrator. However, existing learning strategies often require player identification or rely on strong assumptions, which are not appropriate for multi-agent games. Therefore, in this paper, we introduce the Strategy Representation for Imitation Learning (STRIL) framework, which (1) effectively learns strategy representations in multi-agent games, (2) estimates proposed indicators based on these representations, and (3) filters out sub-optimal data using the indicators. STRIL is a plug-in method that can be integrated into existing IL algorithms. We demonstrate the effectiveness of STRIL across competitive multi-agent scenarios, including Two-player Pong, Limit Texas Hold'em, and Connect Four. Our approach successfully acquires strategy representations and indicators, thereby identifying dominant trajectories and significantly enhancing existing IL performance across these environments.
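To make the pipeline in the abstract concrete, here is a minimal sketch of the filtering stage only: given per-trajectory strategy representations and a scalar indicator score, keep the highest-scoring trajectories and hand them to a standard IL algorithm. This is an illustrative stand-in, not the authors' implementation — the embedding model, the indicator definitions, and `filter_dominant` are all assumptions for the sketch.

```python
import numpy as np

# Illustrative sketch of STRIL's data-filtering stage (not the paper's code).
# Assumes each trajectory already has a learned strategy embedding and a
# scalar indicator score standing in for the paper's proposed indicators.

rng = np.random.default_rng(0)

n_traj, emb_dim = 200, 8
embeddings = rng.normal(size=(n_traj, emb_dim))  # learned strategy representations
indicator = rng.uniform(size=n_traj)             # hypothetical per-trajectory indicator score

def filter_dominant(trajectories, scores, keep_frac=0.25):
    """Keep the top `keep_frac` fraction of trajectories by indicator score."""
    k = max(1, int(len(trajectories) * keep_frac))
    order = np.argsort(scores)[::-1]             # indices sorted by descending score
    return [trajectories[i] for i in order[:k]]

trajectories = list(range(n_traj))               # placeholder trajectory handles
kept = filter_dominant(trajectories, indicator, keep_frac=0.25)
# The kept subset would then be fed to an off-the-shelf IL algorithm (e.g. BC or GAIL).
print(len(kept))  # 50
```

Because STRIL is described as a plug-in, the output of this stage is just a reduced demonstration set; the downstream IL algorithm needs no modification.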
Problem

Research questions and friction points this paper is trying to address.

Detecting diverse strategies in multi-agent game trajectories
Learning strategy representations without player identification
Filtering sub-optimal data to improve imitation learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

STRIL framework
Strategy representation learning
Data-filtering indicators
Shiqi Lei — Institute of Automation, Chinese Academy of Sciences (CASIA)
Kanghon Lee — Korea Advanced Institute of Science and Technology (KAIST)
Linjing Li — Institute of Automation, Chinese Academy of Sciences (CASIA)
Jinkyoo Park — Department of Industrial and Systems Engineering, KAIST
Machine Learning · Game Theory · Optimal Control