🤖 AI Summary
Problem: Online competitive information gathering in partially observable stochastic games (POSGs) with continuous state spaces remains challenging, because existing planners rely either on heavy offline computation or on assumptions about the order of belief each player maintains.
Method: This paper formulates a finite history/horizon refinement of POSGs that avoids belief-order assumptions and heavy offline computation. The approach combines particle-based estimation of the joint state, stochastic gradient play, and online planning in trajectory space, with adjustments for deployment on individual agents. It extends to $N>2$ players and to more complex environments with visual and physical obstacles.
Results: Experiments on continuous pursuit-evasion and warehouse-pickup tasks show evidence of active information gathering, and the method outperforms passive competitors. To our knowledge, this is the first framework to enable efficient, robust active information acquisition and rational trajectory coordination in continuous POSGs. It offers a scalable pathway for coupling perception and decision-making online in multi-agent systems.
📝 Abstract
Game-theoretic agents must make plans that optimally gather information about their opponents. These problems are modeled by partially observable stochastic games (POSGs), but planning in fully continuous POSGs is intractable without heavy offline computation or assumptions on the order of belief maintained by each player. We formulate a finite history/horizon refinement of POSGs that admits competitive information-gathering behavior in trajectory space. Through a series of approximations, we present an online method for computing rational trajectory plans in these games; the method leverages particle-based estimates of the joint state space and performs stochastic gradient play. We also provide the adjustments required to deploy this method on individual agents. The method is tested in continuous pursuit-evasion and warehouse-pickup scenarios (alongside extensions to $N>2$ players and to more complex environments with visual and physical obstacles), demonstrating evidence of active information gathering and outperforming passive competitors.
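
To make the "particle-based estimates" ingredient concrete, the following is a minimal sketch of one generic bootstrap particle filter update in Python. It is not the paper's implementation: the function name `particle_filter_step`, the random-walk motion model, the Gaussian position observation, and the noise parameters are all illustrative assumptions, and the sketch tracks a single 2-D position rather than a joint multi-agent state.

```python
import numpy as np


def particle_filter_step(particles, weights, observation, rng,
                         process_std=0.1, obs_std=0.5):
    """One bootstrap particle filter update for a 2-D position (illustrative only)."""
    # Predict: propagate each particle with an assumed random-walk motion model.
    particles = particles + rng.normal(0.0, process_std, size=particles.shape)

    # Update: reweight by an assumed isotropic Gaussian observation likelihood,
    # working in log space for numerical stability.
    sq_err = np.sum((particles - observation) ** 2, axis=1)
    log_w = np.log(weights + 1e-300) - 0.5 * sq_err / obs_std ** 2
    log_w -= log_w.max()
    weights = np.exp(log_w)
    weights /= weights.sum()

    # Resample (multinomial) when the effective sample size drops below half.
    n = len(weights)
    effective_sample_size = 1.0 / np.sum(weights ** 2)
    if effective_sample_size < 0.5 * n:
        idx = rng.choice(n, size=n, p=weights)
        particles = particles[idx]
        weights = np.full(n, 1.0 / n)
    return particles, weights


def estimate_position(true_position, steps=20, num_particles=500, seed=0):
    """Run the filter on noisy observations of a fixed point; return the weighted mean."""
    rng = np.random.default_rng(seed)
    particles = rng.uniform(-5.0, 5.0, size=(num_particles, 2))
    weights = np.full(num_particles, 1.0 / num_particles)
    for _ in range(steps):
        observation = true_position + rng.normal(0.0, 0.5, size=2)
        particles, weights = particle_filter_step(particles, weights, observation, rng)
    return weights @ particles, weights
```

Calling `estimate_position(np.array([1.0, -2.0]))` returns a weighted-mean estimate that should lie close to the true position. In a game setting, each agent would maintain such a particle set over the joint state and evaluate candidate trajectories against it, but how the paper couples the estimate to planning is not specified in the abstract.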