🤖 AI Summary
This work addresses the energy-constrained, time-sensitive multi-UAV last-mile delivery problem. We propose a multi-agent reinforcement learning framework based on the actor-critic architecture to jointly optimize depot deployment, flight range, and dynamic path planning. The approach integrates K-means clustering for pre-grouping, multi-agent cooperative decision-making, and adaptive path-policy selection, enabling joint long-term optimization of energy efficiency and latency. The framework is trained and validated on realistic delivery datasets. Experimental results demonstrate that, compared to baseline methods, our solution reduces average delivery latency by 23.6% and energy consumption by 18.4%. Moreover, the learned depot deployment strategy exhibits strong transferability and practical deployability. This work establishes a novel paradigm for intelligent, cooperative UAV scheduling in sustainable urban logistics systems.
📝 Abstract
Drones have recently emerged as a faster, safer, and more cost-efficient means of last-mile parcel delivery, particularly for urgent medical deliveries, as highlighted during the pandemic. This paper addresses a new challenge: multi-parcel delivery with a swarm of energy-aware drones under time-sensitive customer requirements. Each drone plans an optimal multi-parcel route within its battery-restricted flight range to minimize delivery delays and reduce energy consumption. The problem is decomposed into three sub-problems: (1) optimizing depot locations and service areas using K-means clustering; (2) determining the optimal flight range for drones through reinforcement learning; and (3) planning and selecting multi-parcel delivery routes via a new optimized plan selection approach. To integrate these solutions and enhance long-term efficiency, we propose a novel algorithm based on actor-critic multi-agent deep reinforcement learning. Extensive experimentation using realistic delivery datasets demonstrates the strong performance of the proposed algorithm. We provide new insights into economic efficiency (minimizing energy consumption), rapid operations (reducing delivery delays and overall execution time), and strategic guidance on depot deployment for practical logistics applications.
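As an illustration of sub-problem (1), the following is a minimal sketch of how K-means clustering could place depots and derive service areas from customer coordinates. It is not the authors' implementation: the function name `kmeans_depots`, the 2-D Euclidean distance model, the random initialization, and the toy customer coordinates are all assumptions made for this example.

```python
import math
import random


def kmeans_depots(customers, k, iters=50, seed=0):
    """Plain Lloyd's K-means on 2-D points (hypothetical helper).

    Returns (depots, assignment): depots are the cluster centroids and
    assignment[i] is the index of the depot serving customers[i].
    """
    rng = random.Random(seed)
    depots = rng.sample(customers, k)  # assumed initialization: k random customers
    assignment = [0] * len(customers)
    for _ in range(iters):
        # Assignment step: each customer joins the service area of its nearest depot.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, depots[j]))
            for p in customers
        ]
        # Update step: move each depot to the centroid of its service area.
        new_depots = []
        for j in range(k):
            members = [p for p, a in zip(customers, new_assignment) if a == j]
            if members:
                cx = sum(x for x, _ in members) / len(members)
                cy = sum(y for _, y in members) / len(members)
                new_depots.append((cx, cy))
            else:
                new_depots.append(depots[j])  # keep a depot that serves no one
        # Stop once neither the service areas nor the depots change.
        if new_assignment == assignment and new_depots == depots:
            break
        assignment, depots = new_assignment, new_depots
    return depots, assignment


# Toy data (made up): two well-separated groups of customers.
customers = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
depots, assignment = kmeans_depots(customers, k=2)
```

In this sketch each depot ends up at the centroid of the customers it serves, and `assignment` defines the service areas. A real deployment would likely need road or airspace constraints and geographic distances rather than plain Euclidean geometry.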