🤖 AI Summary
In Low-Altitude Intelligent Networks (LAINs), energy-constrained UAVs face high latency and low energy efficiency due to dynamically arriving tasks and tightly coupled heterogeneous computing resources.
Method: This paper proposes a hierarchical, dual-timescale air-ground cooperative optimization framework: at the macro-timescale, a VCG auction mechanism—energy-efficiency-aware and incentive-compatible—is employed for UAV trajectory allocation; at the micro-timescale, we introduce a novel heterogeneous multi-agent PPO algorithm where a diffusion model is embedded into the Actor network, enabling observation-conditioned denoising to enhance policy diversity and environmental adaptability.
Contribution/Results: The joint trajectory-and-offloading problem is formulated as a time-dependent integer nonlinear program and optimized via multi-agent reinforcement learning; the resulting framework achieves a 92.7% task success rate, significantly improves energy efficiency, and converges over 40% faster than state-of-the-art methods.
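The macro-timescale VCG step can be sketched as a one-trajectory-per-UAV assignment auction: pick the allocation that maximizes total reported utility, then charge each winner the externality it imposes on the others (Clarke pivot rule). This is a toy model under assumptions not in the paper: the `values` matrix of reported energy-efficiency utilities, the brute-force `best` helper, and the single-assignment structure are all illustrative.

```python
from itertools import permutations

def vcg_assign(values):
    """Toy VCG trajectory auction.

    values[i][j]: UAV i's reported utility (e.g. energy-efficiency gain)
    for flying trajectory j. Assumes #trajectories >= #UAVs.
    Returns the welfare-maximizing assignment and Clarke pivot payments.
    """
    n = len(values)
    trajs = range(len(values[0]))

    def best(excluded):
        # Brute-force welfare-optimal assignment over the non-excluded UAVs.
        agents = [i for i in range(n) if i != excluded]
        return max(
            ((sum(values[i][t] for i, t in zip(agents, p)), dict(zip(agents, p)))
             for p in permutations(trajs, len(agents))),
            key=lambda pair: pair[0],
        )

    total, assignment = best(excluded=None)
    payments = {}
    for i in assignment:
        others_without_i, _ = best(excluded=i)      # welfare if UAV i stayed home
        others_with_i = total - values[i][assignment[i]]
        payments[i] = others_without_i - others_with_i  # externality on the rest
    return assignment, payments
```

Truthful bidding is a dominant strategy under this payment rule, which is what makes the mechanism incentive-compatible; the paper's actual formulation additionally folds in energy-efficiency awareness.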
📝 Abstract
Low-altitude intelligent networks (LAINs) are emerging as a promising architecture for delivering low-latency, energy-efficient edge intelligence in dynamic, infrastructure-limited environments. By integrating unmanned aerial vehicles (UAVs), aerial base stations, and terrestrial base stations, LAINs can support mission-critical applications such as disaster response, environmental monitoring, and real-time sensing. However, these systems face key challenges, including energy-constrained UAVs, stochastic task arrivals, and heterogeneous computing resources. To address these issues, we propose an integrated air-ground collaborative network and formulate a time-dependent integer nonlinear programming problem that jointly optimizes UAV trajectory planning and task offloading decisions. The problem is challenging to solve due to temporal coupling among the decision variables. We therefore design a hierarchical learning framework operating on two timescales. At the large timescale, a Vickrey-Clarke-Groves auction mechanism enables energy-aware, incentive-compatible trajectory assignment. At the small timescale, we propose diffusion-heterogeneous-agent proximal policy optimization, a generative multi-agent reinforcement learning algorithm that embeds latent diffusion models into actor networks. Each UAV samples actions from a Gaussian prior and refines them via observation-conditioned denoising, enhancing adaptability and policy diversity. Extensive simulations show that our framework outperforms baselines in energy efficiency, task success rate, and convergence speed.
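The "sample from a Gaussian prior, then refine via observation-conditioned denoising" step inside the actor can be illustrated with a minimal DDPM-style reverse pass. This is a sketch, not the paper's network: `denoise_fn(a, obs, t)` stands in for the learned, observation-conditioned noise predictor, and the linear variance schedule and `tanh` squashing are illustrative assumptions.

```python
import numpy as np

def sample_action(obs, denoise_fn, steps=10, action_dim=4, rng=None):
    """Draw an action from a Gaussian prior and iteratively denoise it,
    conditioning each step on the UAV's local observation `obs`.

    denoise_fn(a_t, obs, t) -> predicted noise (hypothetical interface
    for the diffusion head embedded in the actor network).
    """
    if rng is None:
        rng = np.random.default_rng()
    a = rng.standard_normal(action_dim)        # x_T ~ N(0, I): the Gaussian prior
    betas = np.linspace(1e-4, 0.02, steps)     # fixed variance schedule (assumed)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    for t in reversed(range(steps)):
        eps_hat = denoise_fn(a, obs, t)        # observation-conditioned noise estimate
        a = (a - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:                              # re-inject noise except at the final step
            a += np.sqrt(betas[t]) * rng.standard_normal(action_dim)
    return np.tanh(a)                          # squash to a bounded action range
```

Because the final action is a stochastic function of both the prior sample and the observation, the same policy can express multimodal behavior, which is the policy-diversity benefit the abstract refers to.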