AID: Agent Intent from Diffusion for Multi-Agent Informative Path Planning

📅 2025-12-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of dynamic environmental belief evolution, low collaborative efficiency, and budget constraints in Multi-Agent Informative Path Planning (MAIPP), this paper proposes the first decentralized MAIPP framework based on diffusion models. Methodologically: (1) a non-autoregressive diffusion model generates long-horizon intent trajectories in one shot, eliminating the error accumulation inherent in autoregressive prediction; (2) a two-stage training pipeline first behavior-clones trajectories from existing MAIPP planners, then fine-tunes the policy with reinforcement learning via Diffusion Policy Policy Optimization (DPPO), enabling long-horizon strategy modeling and online decentralized decision-making. The key contribution is the first application of diffusion models to MAIPP, improving scalability and robustness. Experiments demonstrate up to a 17% improvement in information gain, a 4× speedup in execution time, and effective scaling to larger numbers of agents compared to state-of-the-art baselines.
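The non-autoregressive generation described above can be sketched as a standard reverse-diffusion loop that denoises an entire trajectory jointly, rather than predicting waypoints one at a time. This is a minimal numpy sketch assuming a DDPM-style linear noise schedule; the `model` callable, horizon, and schedule constants are illustrative placeholders, not the paper's actual architecture or hyperparameters.

```python
import numpy as np

def denoise_step(x_t, t, model, betas):
    """One reverse-diffusion step: predict the noise in x_t and remove part of it."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    eps_hat = model(x_t, t)  # predicted noise (placeholder network)
    coef = betas[t] / np.sqrt(1.0 - alpha_bar[t])
    mean = (x_t - coef * eps_hat) / np.sqrt(alphas[t])
    if t > 0:  # inject sampling noise on all but the final step
        mean = mean + np.sqrt(betas[t]) * np.random.randn(*x_t.shape)
    return mean

def sample_intent_trajectory(model, horizon, dim, T=50, seed=0):
    """Generate a full horizon-step trajectory in one shot (non-autoregressive):
    all waypoints are denoised jointly, so per-step prediction errors cannot
    compound the way they do in autoregressive intent prediction."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)   # illustrative linear schedule
    x = rng.standard_normal((horizon, dim))  # start from pure noise
    for t in reversed(range(T)):
        x = denoise_step(x, t, model, betas)
    return x
```

In a real system the placeholder `model` would be a trained denoising network conditioned on the agent's local belief map and neighbors' broadcast intents.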

📝 Abstract
Information gathering in large-scale or time-critical scenarios (e.g., environmental monitoring, search and rescue) requires broad coverage within limited time budgets, motivating the use of multi-agent systems. These scenarios are commonly formulated as multi-agent informative path planning (MAIPP), where multiple agents must coordinate to maximize information gain while operating under budget constraints. A central challenge in MAIPP is ensuring effective coordination while the belief over the environment evolves with incoming measurements. Recent learning-based approaches address this by using distributions over future positions as "intent" to support coordination. However, these autoregressive intent predictors are computationally expensive and prone to compounding errors. Inspired by the effectiveness of diffusion models as expressive, long-horizon policies, we propose AID, a fully decentralized MAIPP framework that leverages diffusion models to generate long-term trajectories in a non-autoregressive manner. AID first performs behavior cloning on trajectories produced by existing MAIPP planners and then fine-tunes the policy using reinforcement learning via Diffusion Policy Policy Optimization (DPPO). This two-stage pipeline enables the policy to inherit expert behavior while learning improved coordination through online reward feedback. Experiments demonstrate that AID consistently improves upon the MAIPP planners it is trained from, achieving up to 4x faster execution and 17% increased information gain, while scaling effectively to larger numbers of agents. Our implementation is publicly available at https://github.com/marmotlab/AID.
Problem

Research questions and friction points this paper is trying to address.

Decentralized multi-agent coordination for informative path planning
Overcoming autoregressive intent prediction's computational cost and errors
Enhancing information gain and execution speed in large-scale scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion models generate long-term trajectories non-autoregressively
Two-stage training combines behavior cloning and reinforcement learning
Fully decentralized framework improves coordination and execution speed
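The two-stage training listed above can be sketched with its two loss functions: a diffusion behavior-cloning loss in stage one, and a PPO-style clipped surrogate in stage two (DPPO treats each denoising step as an action in an MDP). This is a minimal numpy sketch of the objectives only; the function names and the clipping constant are illustrative, not taken from the paper.

```python
import numpy as np

def bc_diffusion_loss(eps_pred, eps_true):
    """Stage 1 (behavior cloning): train the denoiser to reproduce the noise
    that corrupted expert trajectories -- the standard diffusion training loss."""
    return np.mean((eps_pred - eps_true) ** 2)

def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """Stage 2 (DPPO-style fine-tuning): clipped PPO surrogate applied to the
    per-denoising-step likelihood ratio, using reward feedback from rollouts."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return np.minimum(unclipped, clipped).mean()
```

Stage one lets the diffusion policy inherit the expert planner's behavior; stage two improves coordination beyond the expert by maximizing the clipped surrogate on online information-gain rewards.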