🤖 AI Summary
This study addresses the challenge of unstable multi-target tracking in autonomous underwater vehicle (AUV) ad hoc networks, where dynamic topologies and limited acoustic bandwidth severely constrain performance. To this end, the authors propose a scene-adaptive embodied intelligence architecture that models each AUV as an integrated entity combining perception, decision-making, and actuation. The framework features a three-layer cognitive control hierarchy coupled with a beacon-based communication model, enabling coherent coordination between high-level strategies and distributed execution. Its core innovation is a scene-adaptive multi-agent reinforcement learning algorithm (SA-MARL), which uses a dual-path critic mechanism and dynamic weight fusion to decouple task objectives from safety constraints, thereby facilitating autonomous policy evolution. The reported experiments indicate that the proposed approach outperforms mainstream MARL methods in convergence speed and tracking accuracy, and that it remains robust under severe interference and abrupt topological changes.
📝 Abstract
With the rapid advancement of underwater networking and multi-agent coordination technologies, autonomous underwater vehicle (AUV) ad-hoc networks have emerged as a pivotal framework for executing complex maritime missions, such as multi-target tracking. However, traditional data-centric architectures struggle to maintain operational consistency under highly dynamic topological fluctuations and severely constrained acoustic communication bandwidth. This article proposes a scene-adaptive embodied intelligence (EI) architecture for multi-AUV ad-hoc networks, which re-envisions AUVs as embodied entities by integrating perception, decision-making, and physical execution into a unified cognitive loop. To materialize the functional interaction between these layers, we define a beacon-based communication and control model that treats the communication link as a dynamic constraint-aware channel, effectively bridging the gap between high-level policy inference and decentralized physical actuation. Specifically, the proposed architecture employs a three-layer functional framework and introduces a Scene-Adaptive MARL (SA-MARL) algorithm featuring a dual-path critic mechanism. By integrating a scene critic network and a general critic network through a weight-based dynamic fusion process, SA-MARL effectively decouples specialized tracking tasks from global safety constraints, facilitating autonomous policy evolution. Evaluation results demonstrate that the proposed scheme significantly accelerates policy convergence and achieves superior tracking accuracy compared to mainstream MARL approaches, maintaining robust performance even under intense environmental interference and fluid topological shifts.
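To make the dual-path critic idea more concrete, the following is a minimal sketch of one way a weight-based fusion of two critic estimates could work. It is not the authors' implementation: the abstract does not specify how the fusion weights are computed, so the softmax gating, the function names `fusion_weights` and `fused_value`, and the scalar gating scores are all assumptions made for illustration. In a real system the two value estimates would come from the scene critic network and the general critic network.

```python
import math


def fusion_weights(scene_score, general_score):
    """Turn two gating scores into weights that sum to 1 (softmax).

    Assumption: the paper's "dynamic fusion" is modeled here as a softmax
    over two hypothetical scalar scores; the actual rule may differ.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scene_score, general_score)
    e_scene = math.exp(scene_score - m)
    e_general = math.exp(general_score - m)
    total = e_scene + e_general
    return e_scene / total, e_general / total


def fused_value(v_scene, v_general, scene_score, general_score):
    """Blend the scene critic's and general critic's value estimates."""
    w_scene, w_general = fusion_weights(scene_score, general_score)
    return w_scene * v_scene + w_general * v_general


# Equal gating scores give an even blend of the two estimates.
print(fused_value(2.0, 4.0, 0.0, 0.0))  # 3.0
```

Under this assumed scheme, raising the scene score shifts the fused estimate toward the scene critic (the specialized tracking objective), while raising the general score shifts it toward the general critic (the global safety constraints).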