🤖 AI Summary
This work addresses the limited scalability of existing humanoid multi-agent coordination methods and the scarcity of real-world collaborative motion data. The authors propose a single decentralized policy built on a Transformer architecture with teammate tokens, which lets each agent coordinate with a variable number of teammates from local observations alone. By combining masked adversarial motion priors (masked AMP) with task-oriented rewards, the approach generates realistic and diverse collaborative behaviors without requiring real multi-agent interaction data. Notably, this is the first method to support a single policy coordinating an arbitrary number of humanoid agents in human-object interaction tasks, with a formation reward that is independent of team size and object geometry. Experiments demonstrate high success rates when teams of 2 to 8 agents transport objects of various geometries, with strong generalization and consistent coordination.
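To make the teammate-token idea concrete, the following is a minimal sketch of how attention over a variable number of teammate tokens can yield a fixed-size feature. It is not the authors' network: it uses a single attention head, omits the learned query/key/value projections, and the function name `attend_to_teammates` and the toy token values are assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_teammates(self_token: Vector, teammate_tokens: List[Vector]) -> Vector:
    """Single-head scaled dot-product attention (hypothetical simplification).

    The agent's own token is the query; the keys and values are the agent's
    own token plus one token per teammate. Learned projections are omitted.
    """
    tokens = [self_token] + teammate_tokens
    scale = math.sqrt(len(self_token))
    weights = softmax([dot(self_token, t) / scale for t in tokens])
    dim = len(self_token)
    # Weighted sum of value tokens: the output size does not depend on team size.
    return [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]


# Toy usage: the same function serves a 2-agent and an 8-agent team.
ego = [1.0, 0.0, 0.5, -0.5]
feature_team_of_2 = attend_to_teammates(ego, [[0.0, 1.0, 0.0, 0.0]])
feature_team_of_8 = attend_to_teammates(
    ego, [[0.1 * k, 1.0, 0.0, -0.1 * k] for k in range(7)]
)
```

Because the output is a weighted sum over tokens, its length stays the same for any team size and does not depend on the order in which teammates are listed, which is one plausible reason a token-based design suits a single policy shared across team sizes.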
📝 Abstract
Physics-based humanoid control has achieved remarkable progress in enabling realistic and high-performing single-agent behaviors, yet extending these capabilities to cooperative human-object interaction (HOI) remains challenging. We present TeamHOI, a framework that enables a single decentralized policy to handle cooperative HOIs with any number of cooperating agents. Each agent operates on local observations while attending to its teammates through a Transformer-based policy network with teammate tokens, allowing scalable coordination across variable team sizes. To enforce motion realism while addressing the scarcity of cooperative HOI data, we further introduce a masked Adversarial Motion Prior (AMP) strategy that uses single-human reference motions while masking object-interacting body parts during training. The masked regions are then guided by task rewards to produce diverse and physically plausible cooperative behaviors. Finally, to promote stable carrying, we design a formation reward that is agnostic to team size and object shape. We evaluate TeamHOI on a challenging cooperative carrying task involving two to eight humanoid agents and varied object geometries. TeamHOI achieves high success rates and demonstrates coherent cooperation across diverse configurations with a single policy.
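The masking step of the masked AMP strategy can be illustrated with a small sketch. This is an assumption-laden toy, not the authors' implementation: the body-part names, the choice of hands as the object-interacting parts, the use of zeroing (rather than dropping) the masked features, and the function name `masked_amp_observation` are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence

# Hypothetical body-part layout; the abstract does not list the actual parts.
BODY_PARTS = ("torso", "left_leg", "right_leg", "left_hand", "right_hand")
# Assumed object-interacting parts that are hidden from the discriminator.
MASKED_PARTS: FrozenSet[str] = frozenset({"left_hand", "right_hand"})


def masked_amp_observation(
    part_features: Dict[str, Sequence[float]],
    masked_parts: FrozenSet[str] = MASKED_PARTS,
) -> List[float]:
    """Flatten per-part features into a discriminator input, zeroing masked parts."""
    obs: List[float] = []
    for part in BODY_PARTS:
        feats = part_features[part]
        if part in masked_parts:
            # The motion prior never sees these features, so it cannot
            # penalize hand poses that are absent from single-human data.
            obs.extend(0.0 for _ in feats)
        else:
            obs.extend(float(x) for x in feats)
    return obs


# Toy usage: a reference walking pose and a carrying pose that differ only in the hands.
walking = {
    "torso": [0.0, 1.0],
    "left_leg": [0.3, -0.2],
    "right_leg": [-0.3, 0.2],
    "left_hand": [0.1, 0.0],
    "right_hand": [-0.1, 0.0],
}
carrying = dict(walking, left_hand=[0.6, 0.9], right_hand=[0.6, -0.9])

same_for_discriminator = (
    masked_amp_observation(walking) == masked_amp_observation(carrying)
)
```

In this sketch the two poses are indistinguishable to the motion prior, so the style signal constrains only the unmasked parts, leaving the masked parts to be shaped by task rewards as the abstract describes.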