PADiff: Predictive and Adaptive Diffusion Policies for Ad Hoc Teamwork

📅 2025-11-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
In ad hoc teamwork (AHT), agents must predict and rapidly adapt to the behaviors of unknown, heterogeneous teammates. To address this, the paper introduces diffusion models into real-time team coordination, for the first time according to the authors, and proposes a diffusion-based prediction-adaptation framework. The method explicitly captures multimodal cooperative patterns by integrating online teammate behavior prediction into the denoising process, combined with recursive state encoding and latent-variable modeling. The resulting diffusion policy network enables diverse action generation and swift policy adaptation under dynamic environmental conditions. Evaluated on three standard collaborative benchmarks, the approach substantially outperforms existing state-of-the-art methods, generalizing better to unseen teammates and making more robust collaborative decisions.

📝 Abstract
Ad hoc teamwork (AHT) requires agents to collaborate with previously unseen teammates, which is crucial for many real-world applications. The core challenge of AHT is to develop an ego agent that can predict and adapt to unknown teammates on the fly. Conventional RL-based approaches optimize a single expected return, which often causes policies to collapse into a single dominant behavior, thus failing to capture the multimodal cooperation patterns inherent in AHT. In this work, we introduce PADiff, a diffusion-based approach that captures an agent's multimodal behaviors, unlocking its diverse cooperation modes with teammates. However, standard diffusion models lack the ability to predict and adapt in highly non-stationary AHT scenarios. To address this limitation, we propose a novel diffusion-based policy that integrates critical predictive information about teammates into the denoising process. Extensive experiments across three cooperation environments demonstrate that PADiff significantly outperforms existing AHT methods.
Problem

Research questions and friction points this paper is trying to address.

Predicting and adapting to unknown teammates in ad hoc teamwork
Overcoming the collapse into a single dominant behavior in conventional reinforcement learning approaches
Enabling multimodal cooperation patterns through diffusion-based policies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses diffusion models to capture multimodal behaviors
Integrates predictive teammate information into denoising
Adapts policies dynamically for ad hoc teamwork
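The idea of feeding predictive teammate information into the denoising process can be illustrated with a minimal sketch. This is not PADiff's implementation: it is a generic DDPM-style reverse sampling loop in NumPy, and the names `ToyNoisePredictor`, `sample_action`, `make_schedule`, the linear noise schedule, and the untrained random weights are all assumptions made for illustration. The only point it shows is that the noise predictor receives a teammate-prediction vector as conditioning at every denoising step, so the sampled action depends on what the ego agent expects its teammates to do.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (a common default, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

class ToyNoisePredictor:
    """Hypothetical stand-in for a learned noise network.

    Maps (noisy action, timestep, state, teammate prediction) to a noise
    estimate. The weights are random and untrained, for illustration only.
    """
    def __init__(self, action_dim, state_dim, pred_dim, rng):
        in_dim = action_dim + state_dim + pred_dim + 1  # +1 for the timestep
        self.W = rng.normal(scale=0.1, size=(in_dim, action_dim))

    def __call__(self, noisy_action, t_frac, state, teammate_pred):
        # Conditioning: the teammate prediction is concatenated to the input.
        x = np.concatenate([noisy_action, state, teammate_pred, [t_frac]])
        return np.tanh(x @ self.W)

def sample_action(eps_model, state, teammate_pred, action_dim, num_steps, rng):
    """Run the standard DDPM reverse process, conditioned at every step."""
    betas, alphas, alpha_bars = make_schedule(num_steps)
    a = rng.normal(size=action_dim)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps = eps_model(a, t / num_steps, state, teammate_pred)
        # Posterior mean of the reverse step given the predicted noise.
        mean = (a - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added on the final step.
        noise = rng.normal(size=action_dim) if t > 0 else np.zeros(action_dim)
        a = mean + np.sqrt(betas[t]) * noise
    return a

# Usage with made-up dimensions and a made-up teammate-prediction vector.
rng = np.random.default_rng(0)
model = ToyNoisePredictor(action_dim=2, state_dim=4, pred_dim=3, rng=rng)
state = np.zeros(4)
teammate_pred = np.array([1.0, 0.0, 0.0])
action = sample_action(model, state, teammate_pred, action_dim=2, num_steps=20, rng=rng)
print(action.shape)
```

In a trained system the noise predictor would be a neural network, and the teammate-prediction vector would presumably come from a learned predictor updated online; how PADiff builds and injects that signal is not specified on this page.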
Hohei Chan
School of Software Engineering, South China University of Technology, Guangzhou, China
Xinzhi Zhang
School of Software Engineering, South China University of Technology, Guangzhou, China
Antao Xiang
School of Software Engineering, South China University of Technology, Guangzhou, China
Weinan Zhang
Shanghai Jiao Tong University, Shanghai, China
Mengchen Zhao
South China University of Technology
Reinforcement Learning, Multi-Agent Systems, Generative Decision Making, LLM Agents