🤖 AI Summary
This paper addresses ad hoc teamwork under partial observability without prior coordination: a newly deployed agent must identify, in real time, which known team it has joined and which task that team is performing, using only its own observations and without access to the full environment state or its teammates' actions. To this end, the authors propose RecBayes, an approach that never requires environment states or teammate actions at any stage. This sets it apart from prior approaches, which require either full observability at some stage or environments small enough to be modelled tabularly. The core method is a recurrent Bayesian classifier trained on past experiences. Evaluated on benchmark domains from the multi-agent systems literature, adapted for partial observability and scaled up to $10^6$ states and $2^{125}$ distinct observations, the approach identifies known teams and tasks from partial observations alone and, as a result, assists the teams in solving their tasks effectively.
📝 Abstract
This paper proposes RecBayes, a novel approach for ad hoc teamwork under partial observability, a setting where agents are deployed on the fly to environments where pre-existing teams operate. RecBayes never requires, at any stage, access to the states of the environment or the actions of its teammates. We show that by relying on a recurrent Bayesian classifier trained using past experiences, an ad hoc agent is able to identify known teams and the tasks being performed from observations alone. Recent approaches such as PO-GPL (Gu et al., 2021) and FEAT (Rahman et al., 2023) require, at some stage, fully observable states of the environment, actions of teammates, or both, while approaches such as ATPO (Ribeiro et al., 2023) require environments small enough to be modelled tabularly (up to 4.8K states and 1.7K observations in their work). In contrast, we show that RecBayes can handle arbitrarily large spaces while never relying on either states or teammates' actions. Our results in benchmark domains from the multi-agent systems literature, adapted for partial observability and scaled up to 1M states and $2^{125}$ observations, show that RecBayes is effective at identifying known teams and tasks being performed from partial observations alone and, as a result, is able to assist the teams in solving the tasks effectively.
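To make the idea of identifying a team and task from observations alone more concrete, the following is a minimal sketch of a generic recursive Bayesian belief update over a fixed set of team/task hypotheses. It is not the RecBayes architecture, which the abstract does not detail. It assumes that some learned model (for example, a recurrent network) supplies, at each step, the likelihood of the current observation under each hypothesis. The hypothesis labels, the likelihood values, and the function names `update_belief` and `identify` are all hypothetical.

```python
from typing import Dict, Iterable

Belief = Dict[str, float]


def update_belief(belief: Belief, likelihood: Belief) -> Belief:
    """One recursive Bayes step: posterior is proportional to prior * likelihood."""
    unnormalized = {h: belief[h] * likelihood[h] for h in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is impossible under every hypothesis: keep the prior.
        return dict(belief)
    return {h: p / total for h, p in unnormalized.items()}


def identify(prior: Belief, likelihood_stream: Iterable[Belief]) -> Belief:
    """Fold a stream of per-step likelihoods into a belief over hypotheses."""
    belief = dict(prior)
    for likelihood in likelihood_stream:
        belief = update_belief(belief, likelihood)
    return belief


# Hypothetical example: two known team/task pairs and a uniform prior.
prior = {"team_A/task_1": 0.5, "team_B/task_2": 0.5}

# Assumed per-step likelihoods of the agent's own observation under each
# hypothesis; in practice these would come from a learned model.
stream = [
    {"team_A/task_1": 0.8, "team_B/task_2": 0.2},
    {"team_A/task_1": 0.6, "team_B/task_2": 0.4},
]

posterior = identify(prior, stream)
most_likely = max(posterior, key=posterior.get)
```

In this sketch the agent never sees environment states or teammate actions: the only inputs are likelihoods derived from its own observations. The ad hoc agent could then act according to a policy suited to `most_likely`, or weight policies by the full `posterior`.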