🤖 AI Summary
This paper addresses dynamic task allocation for heterogeneous drone fleets in on-demand aerial logistics, where battery capacities are unknown and energy consumption models are unavailable. We propose a decentralized auction framework integrating multi-agent online learning (UCB-based policies) with real-time energy state estimation. Our key innovation is “confidence-counterintuitive scheduling”: each order is assigned to the bidding drone with the lowest confidence, while low-energy drones are permitted to commit to fulfilling orders at a future time, which enables energy-aware dispatching across time. Simulation results demonstrate that, compared to fixed-threshold baselines, our approach reduces average delivery time by 19%, increases the rate of successfully fulfilled orders by 27%, and remains robust in long-term deployment under uncertain energy dynamics.
📝 Abstract
Unmanned Aerial Vehicles (UAVs) are expected to transform logistics, reducing delivery time, costs, and emissions. This study addresses an on-demand delivery problem, in which fleets of UAVs are deployed to fulfil orders that arrive stochastically. Unlike previous work, it considers UAVs with heterogeneous, unknown energy storage capacities and assumes no knowledge of the energy consumption models. We propose a decentralised deployment strategy that combines auction-based task allocation with online learning. Each UAV independently decides whether to bid for orders based on its energy storage charge level, the parcel mass, and the delivery distance. Over time, it refines its policy to bid only for orders within its capability. Simulations using realistic UAV energy models reveal that, counter-intuitively, assigning orders to the least confident bidders reduces delivery times and increases the number of successfully fulfilled orders. This strategy is shown to outperform threshold-based methods, which require UAVs to exceed specific charge levels at deployment. We also propose a variant of the strategy that uses the learned policies for forecasting. This enables UAVs with insufficient charge levels to commit to fulfilling orders at specific future times, helping to prioritise early orders. Our work provides new insights into long-term deployment of UAV swarms, highlighting the advantages of decentralised energy-aware decision-making coupled with online learning in real-world dynamic environments.
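To make the allocation rule concrete, the following is a minimal sketch of one auction round in which the order goes to the least confident bidder. It is not the authors' implementation. The logistic confidence model, the `bid_threshold` of 0.5, the learning rate, and all names (`Uav`, `confidence`, `update`, `allocate`) are assumptions introduced here for illustration; the abstract only states that each UAV bids based on its charge level, the parcel mass, and the delivery distance, and refines its policy online.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Uav:
    name: str
    charge: float  # state of charge in [0, 1]
    # Hypothetical logistic-model weights: bias, charge, parcel mass, distance.
    weights: list = field(default_factory=lambda: [0.0, 0.0, 0.0, 0.0])

    def confidence(self, mass_kg: float, distance_km: float) -> float:
        """Estimated probability that this UAV can complete the order."""
        x = [1.0, self.charge, mass_kg, distance_km]
        z = sum(w * xi for w, xi in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, mass_kg: float, distance_km: float,
               succeeded: bool, lr: float = 0.1) -> None:
        """One online gradient step on the log-loss after observing an outcome."""
        x = [1.0, self.charge, mass_kg, distance_km]
        error = (1.0 if succeeded else 0.0) - self.confidence(mass_kg, distance_km)
        self.weights = [w + lr * error * xi for w, xi in zip(self.weights, x)]


def allocate(uavs, mass_kg, distance_km, bid_threshold=0.5):
    """Collect bids, then award the order to the LEAST confident bidder."""
    bids = [(u.confidence(mass_kg, distance_km), u) for u in uavs]
    # Only UAVs that believe the order is within their capability submit a bid.
    bids = [(c, u) for c, u in bids if c >= bid_threshold]
    if not bids:
        return None  # no UAV bids; the order stays in the queue
    return min(bids, key=lambda bid: bid[0])[1]


fleet = [
    Uav("a", charge=0.9, weights=[2.0, 0.0, 0.0, 0.0]),   # very confident
    Uav("b", charge=0.6, weights=[0.5, 0.0, 0.0, 0.0]),   # barely confident
    Uav("c", charge=0.2, weights=[-1.0, 0.0, 0.0, 0.0]),  # does not bid
]
winner = allocate(fleet, mass_kg=1.0, distance_km=2.0)
```

In this toy fleet, UAV "b" wins: it bids, but with the lowest confidence among the bidders. One plausible reading of the counter-intuitive result is that this rule keeps the most capable UAVs available for more demanding orders that arrive later, although the abstract itself does not state the cause.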