AI Summary
In multi-UAV-assisted mobile edge computing (MEC) systems, jointly optimizing task offloading volume, latency, and energy consumption remains challenging. To address this, we propose TCRAMOP, a multi-objective framework for joint optimization of UAV trajectories and computational resource allocation. We further design DPPOIL, a distributed proximal policy optimization algorithm enhanced with generative adversarial imitation learning, to improve policy convergence and generalization under dynamic environmental conditions. Experimental results demonstrate that DPPOIL significantly outperforms baseline methods: it increases task offloading volume by 23.6%, reduces average offloading latency by 31.4%, and decreases total UAV energy consumption by 27.8%. The framework effectively balances system efficiency and resource overhead, offering a scalable, distributed intelligent decision-making solution for low-latency, high-energy-efficiency space-air-ground integrated edge computing.
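The three objectives above (offloading volume, latency, energy) are typically combined into a single scalar signal when the problem is solved with reinforcement learning. As a minimal sketch, assuming a weighted-sum scalarization (the function name, weights, and exact form below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def scalarized_objective(num_offloaded: float,
                         total_delay: float,
                         total_energy: float,
                         w: tuple = (1.0, 0.5, 0.5)) -> float:
    """Hypothetical weighted-sum scalarization of the three TCRAMOP
    objectives: reward the number of offloaded tasks, penalize total
    offloading delay and total UAV energy consumption.

    The weights `w` trade off the objectives and are purely
    illustrative; the paper's actual reward design may differ."""
    w1, w2, w3 = w
    return w1 * num_offloaded - w2 * total_delay - w3 * total_energy

# Example: 10 tasks offloaded, 4 s total delay, 2 J total energy
value = scalarized_objective(10, 4.0, 2.0)  # 10 - 2.0 - 1.0 = 7.0
```

A weighted sum is the simplest scalarization; it lets a single-policy DRL agent optimize all three objectives at once, at the cost of fixing the trade-off a priori through the weights.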
Abstract
Mobile edge computing (MEC) is a promising technique for improving the computational capacity of smart devices (SDs) in the Internet of Things (IoT). However, the performance of MEC is restricted by its fixed location and limited service scope. Hence, we investigate an unmanned aerial vehicle (UAV)-assisted MEC system in which multiple UAVs are dispatched and each UAV can simultaneously provide computing services to multiple SDs. To improve system performance, we formulate a UAV-based trajectory control and resource allocation multi-objective optimization problem (TCRAMOP) that simultaneously maximizes the number of tasks offloaded to the UAVs and minimizes the total offloading delay and total UAV energy consumption by optimizing the UAVs' flight paths and the computing resources allocated to the served SDs. Then, considering that solving TCRAMOP requires continuous decision-making in a dynamic system, we propose an enhanced deep reinforcement learning (DRL) algorithm, namely distributed proximal policy optimization with imitation learning (DPPOIL), which incorporates generative adversarial imitation learning to improve policy performance. Simulation results demonstrate the effectiveness of the proposed DPPOIL and show that its learned strategy outperforms other baseline methods.
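DPPOIL combines PPO-style policy updates with a generative-adversarial imitation signal. As a minimal NumPy sketch of the two core quantities, the PPO clipped surrogate objective and a GAIL-style discriminator-based reward (function names, the clipping constant, and the reward form are standard PPO/GAIL definitions, not the paper's exact implementation):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio     = pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage = estimated advantage A(s, a)
    eps       = clipping range (0.2 is the common default)

    Clipping removes the incentive to move the ratio outside
    [1 - eps, 1 + eps], stabilizing the policy update."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.minimum(unclipped, clipped)

def gail_reward(d_prob, eps=1e-8):
    """GAIL-style imitation reward from a discriminator output
    D(s, a) in (0, 1), where D is trained to score expert-like
    state-action pairs highly. r = -log(1 - D(s, a)) grows as the
    agent's behavior becomes indistinguishable from the expert's."""
    return -np.log(1.0 - d_prob + eps)

# In a DPPOIL-like scheme, the environment reward could be mixed with
# the imitation reward (the mixing weight lam is an assumption):
def mixed_reward(env_reward, d_prob, lam=0.5):
    return env_reward + lam * gail_reward(d_prob)
```

Distributing this scheme means running several actors that collect trajectories in parallel and performing the clipped update on the aggregated batch; the imitation term shapes the reward so the policy benefits from expert demonstrations early in training.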