🤖 AI Summary
This paper addresses collaborative active search and tracking of multiple unknown, dynamic targets in a known, partially observable environment, where the available information changes over time and comes from sources of varying trustworthiness.
Method: We propose a multi-agent, long-horizon decision-making pipeline that integrates external information. It introduces a time-varying weighted belief representation and embeds LSTM-based trajectory prediction into an optimization that reasons over time-configuration space. When communication is available, a headquarters consolidates the results collected asynchronously by agents and external sources, then allocates each agent to maximize the overall team's utility.
Results: Simulations against baselines and physics-engine experiments show mission completion times 1.3 to 3.2 times faster in finding all targets, including in the most challenging scenarios, where targets outnumber agents five to one. By contrast, current methods typically rely on short-horizon target prediction and do not use external information.
📝 Abstract
This paper addresses the problem of actively searching for and tracking multiple unknown dynamic objects in a known environment, using multiple cooperative autonomous agents under partial observability. Tracking of a target ends when its uncertainty falls below a threshold. Current methods typically assume homogeneous agents without access to external information and use short-horizon target predictive models. Such assumptions limit real-world applications. We propose a fully integrated pipeline whose main contributions are: (1) a time-varying weighted belief representation capable of handling knowledge that changes over time, which includes external reports of varying levels of trustworthiness in addition to the agents' own observations; (2) the integration of Long Short-Term Memory (LSTM)-based trajectory prediction within the optimization framework for long-horizon decision-making, which reasons in time-configuration space, thus increasing responsiveness; and (3) a comprehensive system that accounts for multiple agents and enables information-driven optimization. When communication is available, our strategy consolidates exploration results collected asynchronously by agents and external sources at a headquarters, which can allocate each agent to maximize the overall team's utility, using all available information. We tested our approach extensively in simulations against baselines, and in robustness and ablation studies. In addition, we performed experiments in a 3D physics-based robot simulator to test applicability in the real world, as well as with real-world trajectories obtained from an oceanography computational fluid dynamics simulator. Results show the effectiveness of our method, which achieves mission completion times 1.3 to 3.2 times faster in finding all targets, even under the most challenging scenarios where the number of targets is 5 times greater than that of the agents.
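To make contribution (1) more concrete, here is a minimal Python sketch of one way a time-varying, trust-weighted belief could behave. It is an illustration under stated assumptions, not the paper's model: the discrete cell grid, the exponential forgetting toward a uniform belief, the trust-tempered sensor likelihoods, the entropy stopping test, and every function and parameter name (`decay_toward_uniform`, `fuse_report`, `p_detect`, `p_false`, `rate`) are hypothetical.

```python
import math


def decay_toward_uniform(belief, dt, rate):
    """Relax a belief toward uniform as its information ages.

    Assumption: exponential forgetting with a hypothetical `rate`.
    """
    n = len(belief)
    keep = math.exp(-rate * dt)  # fraction of old knowledge retained
    return [keep * b + (1.0 - keep) / n for b in belief]


def fuse_report(belief, cell, detected, trust, p_detect=0.9, p_false=0.1):
    """Bayes update for a report about `cell`, tempered by source trust.

    trust = 1.0 applies the full (assumed) sensor model;
    trust = 0.0 makes both likelihoods 0.5, so the belief is unchanged.
    """
    p_hit = trust * p_detect + (1.0 - trust) * 0.5
    p_fa = trust * p_false + (1.0 - trust) * 0.5
    posterior = []
    for i, b in enumerate(belief):
        p_positive = p_hit if i == cell else p_fa
        likelihood = p_positive if detected else 1.0 - p_positive
        posterior.append(likelihood * b)
    total = sum(posterior)
    return [p / total for p in posterior]


def entropy(belief):
    """Shannon entropy in nats, used here as the uncertainty measure."""
    return -sum(b * math.log(b) for b in belief if b > 0.0)


def tracking_done(belief, threshold):
    """Stop tracking once uncertainty drops below a chosen threshold."""
    return entropy(belief) < threshold


# Usage: an agent (fully trusted) and an external source (half trusted)
# both report a detection in cell 2 of a four-cell world.
prior = [0.25, 0.25, 0.25, 0.25]
from_agent = fuse_report(prior, cell=2, detected=True, trust=1.0)
from_external = fuse_report(prior, cell=2, detected=True, trust=0.5)
aged = decay_toward_uniform(from_agent, dt=5.0, rate=0.2)
```

In this sketch a less trusted report moves the belief less than a fully trusted one, and any report loses influence as it ages, which is the qualitative behavior the abstract attributes to the weighted, time-varying belief.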