Long-Term Mapping of the Douro River Plume with Multi-Agent Reinforcement Learning

📅 2025-10-03
🤖 AI Summary
This study addresses the challenge of long-term (multi-day) monitoring of the Douro River plume. We propose a multi-AUV cooperative framework featuring: (i) an intermittent communication protocol orchestrated by a central coordinator to reduce energy consumption and communication overhead; (ii) spatiotemporal Gaussian Process Regression (GPR) to model the plume's dynamic evolution; and (iii) a multi-head Q-network-based multi-agent reinforcement learning (MARL) policy for adaptive navigation and task allocation. The key innovation is the tight coupling of GPR-based environmental modeling and MARL-based decision-making into a closed-loop modeling-decision architecture, enabling cross-seasonal policy generalization. Evaluated in high-fidelity simulations driven by the Delft3D ocean model, the framework more than doubles operational endurance compared to single- and multi-agent baselines while simultaneously improving monitoring accuracy. Moreover, accuracy remains stable as the AUV count grows, demonstrating scalability and robustness for long-term, large-scale marine environmental monitoring.

📝 Abstract
We study the problem of long-term (multi-day) mapping of a river plume using multiple autonomous underwater vehicles (AUVs), focusing on the Douro River as a representative use case. We propose an energy- and communication-efficient multi-agent reinforcement learning approach in which a central coordinator intermittently communicates with the AUVs, collecting measurements and issuing commands. Our approach integrates spatiotemporal Gaussian process regression (GPR) with a multi-head Q-network controller that regulates direction and speed for each AUV. Simulations using the Delft3D ocean model demonstrate that our method consistently outperforms both single- and multi-agent benchmarks, with scaling the number of agents improving both mean squared error (MSE) and operational endurance. In some instances, doubling the number of AUVs more than doubles endurance while maintaining or improving accuracy, underscoring the benefits of multi-agent coordination. Our learned policies generalize across unseen seasonal regimes spanning different months and years, demonstrating promise for future developments of data-driven long-term monitoring of dynamic plume environments.
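To illustrate the spatiotemporal GPR component described in the abstract, here is a minimal numpy-only sketch. The anisotropic squared-exponential kernel over (x, y, t), the length scales, and the noise level are illustrative assumptions, not the paper's exact model; the predictive variance is the kind of uncertainty signal an adaptive sampler could steer AUVs toward.

```python
import numpy as np

def rbf_kernel(A, B, length_scales):
    # Anisotropic squared-exponential kernel over (x, y, t) inputs.
    # length_scales is a 3-vector: spatial scales plus a temporal scale.
    d = (A[:, None, :] - B[None, :, :]) / length_scales
    return np.exp(-0.5 * np.sum(d ** 2, axis=-1))

def gpr_predict(X_train, y_train, X_query, length_scales, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at query points."""
    K = rbf_kernel(X_train, X_train, length_scales)
    K[np.diag_indices_from(K)] += noise          # observation noise on the diagonal
    alpha = np.linalg.solve(K, y_train)
    K_star = rbf_kernel(X_query, X_train, length_scales)
    mean = K_star @ alpha
    # Predictive variance: prior variance minus the explained part.
    v = np.linalg.solve(K, K_star.T)
    var = 1.0 - np.sum(K_star * v.T, axis=1)
    return mean, var
```

With near-zero noise the posterior mean interpolates the training measurements, and variance shrinks to zero at observed (x, y, t) points — the mechanism by which past measurements inform where the plume field is still uncertain.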
Problem

Research questions and friction points this paper is trying to address.

Long-term mapping of river plumes using multiple autonomous underwater vehicles
Energy-efficient multi-agent reinforcement learning for underwater coordination
Generalizing monitoring policies across seasonal changes in dynamic environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent reinforcement learning coordinates AUVs efficiently
Spatiotemporal Gaussian process regression integrates with Q-network control
Method scales agent count to boost accuracy and endurance
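The multi-head Q-network mentioned above can be sketched as follows. This is a toy forward pass only (no training loop), with hypothetical dimensions: a shared trunk feeds one head of Q-values per control variable, so each AUV's direction and speed are chosen by independent argmax operations over a compact observation.

```python
import numpy as np

rng = np.random.default_rng(0)

class MultiHeadQNet:
    """Toy multi-head Q-network: shared trunk, one head per control output."""

    def __init__(self, obs_dim, hidden, n_dirs, n_speeds):
        self.W1 = rng.normal(0.0, 0.1, (obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.Wd = rng.normal(0.0, 0.1, (hidden, n_dirs))    # direction head
        self.Ws = rng.normal(0.0, 0.1, (hidden, n_speeds))  # speed head

    def forward(self, obs):
        h = np.tanh(obs @ self.W1 + self.b1)   # shared features
        return h @ self.Wd, h @ self.Ws        # Q-values per head

    def act(self, obs):
        # Greedy action per head: a (direction, speed) command pair.
        q_dir, q_speed = self.forward(obs)
        return int(np.argmax(q_dir)), int(np.argmax(q_speed))
```

Splitting the action space into per-head argmaxes keeps the output size additive (n_dirs + n_speeds) rather than multiplicative (n_dirs × n_speeds), which is the usual motivation for multi-head Q-architectures over joint discrete actions.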
Nicolò Dal Fabbro
Department of Electrical and Systems Engineering, University of Pennsylvania, USA
Milad Mesbahi
Department of Electrical and Systems Engineering, University of Pennsylvania, USA
Renato Mendes
Laboratório de Sistemas e Tecnologia Subaquática (LSTS), Faculdade de Engenharia da Universidade do Porto, Portugal
João Borges de Sousa
Unknown affiliation
George J. Pappas
Department of Electrical and Systems Engineering, University of Pennsylvania, USA