🤖 AI Summary
This work addresses the challenges of efficient real-time communication in underwater acoustic networks, which are constrained by narrow bandwidth, large propagation delays, and highly dynamic channels. The authors propose a two-layer multi-armed bandit (MAB) framework: the inner layer adaptively optimizes modulation schemes and transmission power by jointly considering channel state and Age of Information (AoI) to maximize throughput, while the outer layer dynamically adjusts feedback intervals based on throughput to reduce control overhead. This study is the first to incorporate AoI into underwater acoustic adaptive control, and it introduces a hierarchical cooperative MAB mechanism comprising a Contextual Delayed MAB (CD-MAB) algorithm, which handles delayed, context-dependent feedback, and a feedback-scheduling MAB algorithm. Experimental results on the DESERT platform demonstrate up to a 20.61% improvement in throughput and a 36.60% reduction in energy consumption compared to existing deep reinforcement learning approaches.
📝 Abstract
Underwater Acoustic (UWA) networks are vital for remote sensing and ocean exploration but face inherent challenges such as limited bandwidth, long propagation delays, and highly dynamic channels. These constraints hinder real-time communication and degrade overall system performance. To address these challenges, this paper proposes a bilevel Multi-Armed Bandit (MAB) framework. At the fast inner level, a Contextual Delayed MAB (CD-MAB) jointly optimizes adaptive modulation and transmission power based on both channel-state feedback and its Age of Information (AoI), thereby maximizing throughput. At the slower outer level, a Feedback Scheduling MAB dynamically adjusts the channel-state feedback interval according to throughput dynamics: stable throughput allows longer update intervals, while throughput drops trigger more frequent updates. This adaptive mechanism reduces feedback overhead and enhances responsiveness to varying network conditions. The proposed bilevel framework is computationally efficient and well-suited to resource-constrained UWA networks. Simulation results using the DESERT Underwater Network Simulator demonstrate throughput gains of up to 20.61% and energy savings of up to 36.60% compared with Deep Reinforcement Learning (DRL) baselines reported in the existing literature.
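The cooperation between the two levels can be sketched with plain epsilon-greedy bandits. Everything below is a hypothetical toy, not the paper's actual CD-MAB or Feedback Scheduling MAB: the two-state channel model, the reward values, the candidate intervals, and the feedback cost are all assumed purely for illustration. Per-context inner bandits pick a (modulation, power) arm from a possibly stale channel observation, while the outer bandit trades throughput against feedback overhead when choosing how long that observation stays in use.

```python
import random

random.seed(0)

class EpsGreedyBandit:
    """Epsilon-greedy bandit with incremental mean value updates."""
    def __init__(self, n_arms, eps=0.1):
        self.eps = eps
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select(self):
        if random.random() < self.eps:
            return random.randrange(len(self.values))
        return self.values.index(max(self.values))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Hypothetical two-state channel: the best (modulation, power) arm
# differs per state; reward 1.0 stands in for a successful high-rate frame.
BEST_ARM = {0: 1, 1: 0}

def transmit(true_channel, arm):
    return 1.0 if arm == BEST_ARM[true_channel] else 0.0

N_CONTEXTS, N_ARMS = 2, 3
inner = [EpsGreedyBandit(N_ARMS) for _ in range(N_CONTEXTS)]   # inner level
intervals = [1, 5, 10]                   # candidate feedback intervals (assumed)
outer = EpsGreedyBandit(len(intervals))  # outer level picks an interval
FEEDBACK_COST = 0.05                     # assumed per-feedback overhead

true_channel = 0
for episode in range(300):
    k = outer.select()
    interval = intervals[k]
    observed = true_channel              # fresh feedback at the interval start
    total = 0.0
    for _ in range(interval):
        if random.random() < 0.1:        # channel drifts; 'observed' ages (AoI)
            true_channel = random.randrange(N_CONTEXTS)
        arm = inner[observed].select()   # act on a possibly stale context
        r = transmit(true_channel, arm)
        inner[observed].update(arm, r)
        total += r
    # Outer reward: mean throughput minus amortized feedback overhead.
    outer.update(k, total / interval - FEEDBACK_COST / interval)
```

In this toy, a longer interval lowers the amortized feedback cost but lets the acting context go stale, which is exactly the tension the outer scheduling bandit has to resolve.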