🤖 AI Summary
To address air-ground cooperative resource allocation in 6G networks, where unmanned aerial vehicles (UAVs) dynamically serve mobile ground user equipment (UEs), this paper proposes FLARE, a framework for real-time joint optimization of UAV positioning, altitude, transmit power, and bandwidth allocation. FLARE integrates Silhouette-based K-Means UE clustering with multi-agent reinforcement learning through a hybrid MADDPG-DQN policy: the clustering step groups UEs by spatial distribution and channel state for efficient UAV-UE association, MADDPG handles continuous power control, and DQN manages discrete bandwidth-block allocation. Under a 5 Mbps minimum-rate constraint, simulations show that FLARE serves 73.45% more UEs than a pure MADDPG baseline, markedly improving resource utilization and quality of service in dynamic environments.
📝 Abstract
This letter addresses a critical challenge in 6G and beyond wireless networks: the joint optimization of power and bandwidth allocation for aerial intelligent platforms, specifically uncrewed aerial vehicles (UAVs), operating in highly dynamic environments with mobile ground user equipment (UEs). We introduce FLARE (Flying Learning Agents for Resource Efficiency), a learning-enabled aerial intelligence framework that jointly optimizes UAV positioning, altitude, transmit power, and bandwidth allocation in real time. To adapt to UE mobility, we employ Silhouette-based K-Means clustering, which dynamically groups users and deploys UAVs at cluster centroids for efficient service delivery. The problem is modeled as a multi-agent control task, with bandwidth discretized into resource blocks and power treated as a continuous variable. To solve it, FLARE employs a hybrid reinforcement learning strategy that combines Multi-Agent Deep Deterministic Policy Gradient (MADDPG) and Deep Q-Network (DQN) to enhance learning efficiency. Simulation results demonstrate that our method significantly enhances user coverage, serving 73.45% more users under a 5 Mbps data rate constraint than the MADDPG baseline.
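The letter does not include code for the clustering stage, so the following is a minimal pure-Python sketch of the silhouette-guided K-Means idea: pick the number of clusters k that maximizes the mean silhouette coefficient, then place one UAV over each cluster centroid. The UE coordinates, the candidate range for k, and the helper names (`select_k`, `farthest_first_init`) are illustrative assumptions, not the paper's implementation; FLARE's clustering also accounts for channel state, which is not modeled here, and a real system would likely use a library such as scikit-learn.

```python
import math

def farthest_first_init(points, k):
    """Deterministic seeding (illustrative choice, not from the paper):
    start at the first point, then repeatedly add the point farthest
    from its nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid wins.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            new.append(tuple(sum(v) / len(members) for v in zip(*members)) if members else centroids[c])
        if new == centroids:  # converged
            break
        centroids = new
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient s(i) = (b - a) / max(a, b), where a is the
    mean intra-cluster distance and b the mean distance to the nearest other cluster."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    score = 0.0
    for p, l in zip(points, labels):
        if len(clusters[l]) == 1:
            continue  # s(i) = 0 by convention for singleton clusters
        a = sum(math.dist(p, q) for q in clusters[l]) / (len(clusters[l]) - 1)
        b = min(sum(math.dist(p, q) for q in clusters[m]) / len(clusters[m])
                for m in clusters if m != l)
        score += (b - a) / max(a, b)
    return score / len(points)

def select_k(points, k_range):
    """Silhouette-guided model selection: keep the k with the best mean silhouette."""
    return max(k_range, key=lambda k: silhouette(points, kmeans(points, k)))

# Hypothetical UE ground positions (meters): three well-separated hotspots.
ues = [(x + dx, y + dy)
       for (x, y) in [(0.0, 0.0), (200.0, 0.0), (0.0, 200.0)]
       for (dx, dy) in [(0, 0), (5, 0), (0, 5), (5, 5)]]

k = select_k(ues, range(2, 7))
labels = kmeans(ues, k)
# One UAV hovers over each cluster centroid (2D projection; altitude is optimized separately).
uav_xy = [tuple(sum(v) / len(members) for v in zip(*members))
          for members in ([p for p, l in zip(ues, labels) if l == c] for c in range(k))]
```

On this toy layout the silhouette criterion recovers the three hotspots, so three UAVs are deployed, one per centroid; in FLARE the RL agents would then tune each UAV's altitude, power, and bandwidth blocks around these anchor positions.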