Social World Model-Augmented Mechanism Design Policy Learning

📅 2025-10-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses core challenges in heterogeneous multi-agent systems: the difficulty of modeling individual traits (e.g., skills, preferences), weak mechanism adaptability, and low sample efficiency. It proposes SWM-AP (Social World Model-Augmented Mechanism Design Policy Learning), whose hierarchical social world model jointly infers agents' latent traits and behavioral dynamics, enabling concurrent online trait estimation and policy optimization. By integrating world modeling, hierarchical behavioral representation, and reinforcement learning, SWM-AP unifies model-based and model-free mechanism design. Evaluated on tax policy design, team coordination, and facility location tasks, it reports significant gains over baselines: +23.6% average cumulative reward and 3.1× higher sample efficiency. To the authors' knowledge, SWM-AP is the first framework to combine high-fidelity trait awareness with sample-efficient mechanism adaptation.

📝 Abstract
Designing adaptive mechanisms to align individual and collective interests remains a central challenge in artificial social intelligence. Existing methods often struggle with modeling heterogeneous agents possessing persistent latent traits (e.g., skills, preferences) and dealing with complex multi-agent system dynamics. These challenges are compounded by the critical need for high sample efficiency due to costly real-world interactions. World Models, by learning to predict environmental dynamics, offer a promising pathway to enhance mechanism design in heterogeneous and complex systems. In this paper, we introduce a novel method named SWM-AP (Social World Model-Augmented Mechanism Design Policy Learning), which learns a social world model hierarchically modeling agents' behavior to enhance mechanism design. Specifically, the social world model infers agents' traits from their interaction trajectories and learns a trait-based model to predict agents' responses to the deployed mechanisms. The mechanism design policy collects extensive training trajectories by interacting with the social world model, while concurrently inferring agents' traits online during real-world interactions to further boost policy learning efficiency. Experiments in diverse settings (tax policy design, team coordination, and facility location) demonstrate that SWM-AP outperforms established model-based and model-free RL baselines in cumulative rewards and sample efficiency.
Problem

Research questions and friction points this paper is trying to address.

Designing adaptive mechanisms to align individual and collective interests
Modeling agents with persistent latent traits and complex system dynamics
Achieving high sample efficiency for mechanism design in multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical social world model infers agent traits
Trait-based model predicts agent responses to mechanisms
Online trait inference boosts policy learning efficiency
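The loop described above (infer traits from interaction trajectories, predict responses with a trait-conditioned world model, then optimize the mechanism against that model instead of the costly real system) can be illustrated with a toy sketch. Everything here, including the linear response form, the `infer_traits` least-squares fit, and the welfare objective, is an illustrative assumption and not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: each agent has a latent scalar trait (e.g., skill) that
# shapes its response to a scalar mechanism parameter (e.g., a tax rate).
TRUE_TRAITS = np.array([0.2, 0.5, 0.9])

def real_response(mechanism, traits, noise=0.05):
    """Ground-truth agent responses (unknown to the mechanism designer)."""
    return traits * (1.0 - mechanism) + noise * rng.standard_normal(traits.shape)

def infer_traits(trajectories, mechanisms):
    """Online trait inference: per-agent least-squares slope estimate
    from observed (mechanism, response) pairs."""
    X = 1.0 - np.asarray(mechanisms)   # shape (T,)
    Y = np.asarray(trajectories)       # shape (T, n_agents)
    return (X @ Y) / (X @ X)

def world_model_response(mechanism, traits_hat):
    """Trait-conditioned world model: predicts responses without real interaction."""
    return traits_hat * (1.0 - mechanism)

# A few costly real-world interactions supply the training trajectories.
mechanisms, trajectories = [], []
for m in [0.1, 0.4, 0.7]:
    mechanisms.append(m)
    trajectories.append(real_response(m, TRUE_TRAITS))

traits_hat = infer_traits(trajectories, mechanisms)

# Cheap model-based search: pick the mechanism maximizing a social objective
# (total predicted response minus a cost on the mechanism itself).
candidates = np.linspace(0.0, 1.0, 101)
welfare = [world_model_response(m, traits_hat).sum() - 0.5 * m for m in candidates]
best = candidates[int(np.argmax(welfare))]
```

The key sample-efficiency idea is visible even in this sketch: only three real interactions are used, while the 101-candidate mechanism search runs entirely inside the learned model.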
👥 Authors

Xiaoyuan Zhang
Peking University
Multi-Agent Learning, Reinforcement Learning

Yizhe Huang
Institute for Artificial Intelligence, Peking University

Chengdong Ma
Peking University
Reinforcement Learning, Multi-Agent Systems

Zhixun Chen
The Hong Kong University of Science and Technology (Guangzhou)

Long Ma
Dalian University of Technology
Computer Vision, Image Processing

Yali Du
Turing Fellow, Associate Professor, King's College London
Multi-Agent Reinforcement Learning, Human-AI Coordination, Alignment, Cooperative AI

Song-Chun Zhu
State Key Laboratory of General Artificial Intelligence, BIGAI, Beijing, China

Yaodong Yang
Institute for Artificial Intelligence, Peking University

Xue Feng
State Key Laboratory of General Artificial Intelligence, BIGAI, Beijing, China