Heterogeneous AAV Logistics Task Allocation: A Reinforcement Learning Enhanced Overlapping Coalition Formation Game Approach

📅 2026-05-25
🤖 AI Summary
This study addresses the challenge of stochastic, time-sensitive task allocation for heterogeneous autonomous aerial vehicles (AAVs) in dynamic urban logistics. The authors propose a dynamic task assignment framework that integrates reinforcement learning with overlapping coalition formation (OCF) games. The key innovation is a Transformer architecture embedded in the soft actor-critic (SAC) network, which encodes variable-length logistics states and captures spatiotemporal dependencies among tasks; the learned policy replaces conventional heuristic rules for coalition updates. The coalition formation mechanism is proven to constitute an exact potential game, guaranteeing convergence to a Nash-stable equilibrium within a finite number of steps. In a scenario with 32 AAVs and 80 tasks, the proposed method reduces the generalized logistics cost by 39.76% compared with a heuristic OCF baseline. Both high-fidelity simulations and indoor flight experiments validate the approach's effectiveness and practical applicability.
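To illustrate how self-attention can turn a task set of any size into a fixed-size state, here is a minimal sketch in plain Python. It is not the paper's network: it uses a single attention head with identity query/key/value projections (no learned weights), and the function name `encode_tasks` and the four-dimensional task features are assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def encode_tasks(tasks):
    """Encode a variable-length list of task feature vectors as one fixed-size vector.

    Assumption: identity projections stand in for the learned Q/K/V matrices
    of a real Transformer layer, and a single head is used.
    """
    dim = len(tasks[0])
    scale = math.sqrt(dim)
    attended = []
    for query in tasks:
        # Scaled dot-product scores of this task against every task in the set.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tasks]
        weights = softmax(scores)
        # Attention output: weighted average of all task vectors.
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, tasks)) for i in range(dim)]
        )
    # Mean-pool over tasks so the result has length `dim` for any set size.
    return [sum(row[i] for row in attended) / len(attended) for i in range(dim)]

# Hypothetical task features, e.g. (x, y, deadline, payload), all normalized.
few = [[0.2, 0.8, 1.0, 0.5], [0.9, 0.1, 0.3, 0.7], [0.4, 0.4, 0.6, 0.2]]
many = few + [[0.7, 0.6, 0.1, 0.9], [0.1, 0.3, 0.8, 0.4]]
print(len(encode_tasks(few)), len(encode_tasks(many)))
```

Because the pooled output does not depend on the order or number of tasks, a downstream policy network can consume it even as orders arrive and leave.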
📝 Abstract
In dynamic urban logistics, the stochastic emergence of time-sensitive tasks poses a significant optimality challenge for heterogeneous AAV logistics task allocation. To address this problem, a reinforcement learning enhanced overlapping coalition formation (OCF) game approach is proposed. A dynamic task allocation model is established, in which global optimality is quantified by a generalized logistics cost that couples service quality and resource consumption. To deal with the time-varying task sets induced by stochastic order arrivals, a transformer-based soft actor-critic network is designed. By leveraging multi-head self-attention to encode variable-length logistics states and capture task-wise spatiotemporal dependencies, the learned policy adaptively guides coalition updates, replacing heuristic rules in the OCF game. On this basis, heterogeneous AAVs can form more efficient overlapping coalitions for dynamic logistics tasks. The resulting coalition formation process is proven to constitute an exact potential game, which guarantees convergence to a Nash-stable equilibrium within a finite number of iterations. Numerical simulations demonstrate that the proposed algorithm effectively improves the optimality of task allocation under the generalized logistics cost criterion. In a scenario with 32 AAVs and 80 tasks, the algorithm achieves a 39.76% cost reduction compared with the heuristic OCF baseline. Indoor flight experiments further validate its practicality.
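The finite-convergence claim can be illustrated with a toy example. The sketch below is not the paper's utility design or cost model, neither of which is given here. It assumes the simplest exact potential game, in which every agent's cost equals one shared global cost, so any unilateral change shifts the agent's cost and the potential by the same amount. All names (`potential`, `best_response_dynamics`, `is_nash_stable`) and all numbers are hypothetical.

```python
from itertools import combinations

def potential(profile, capability, demand, travel):
    """Toy global cost: unmet task demand plus travel cost of each membership."""
    supplied = [0.0] * len(demand)
    cost = 0.0
    for agent, tasks in enumerate(profile):
        for t in tasks:
            # An agent in several coalitions splits its capability evenly.
            supplied[t] += capability[agent] / len(tasks)
            cost += travel[agent][t]
    return cost + sum(max(0.0, d - s) for d, s in zip(demand, supplied))

def strategies(n_tasks, max_memberships):
    """Every set of tasks an agent may join, including joining none."""
    return [frozenset(c) for k in range(max_memberships + 1)
            for c in combinations(range(n_tasks), k)]

def best_response_dynamics(capability, demand, travel, max_memberships=2):
    """Agents take turns switching to the strategy that lowers the potential most."""
    options = strategies(len(demand), max_memberships)
    profile = [frozenset() for _ in capability]
    phi = potential(profile, capability, demand, travel)
    history = [phi]
    improved = True
    while improved:
        improved = False
        for agent in range(len(capability)):
            best, best_phi = profile[agent], phi
            for option in options:
                trial = profile[:agent] + [option] + profile[agent + 1:]
                value = potential(trial, capability, demand, travel)
                if value < best_phi - 1e-12:
                    best, best_phi = option, value
            if best != profile[agent]:
                profile[agent], phi = best, best_phi
                history.append(phi)
                improved = True
    return profile, history

def is_nash_stable(profile, capability, demand, travel, max_memberships=2):
    """True if no single agent can lower the potential by deviating alone."""
    phi = potential(profile, capability, demand, travel)
    for agent in range(len(capability)):
        for option in strategies(len(demand), max_memberships):
            trial = profile[:agent] + [option] + profile[agent + 1:]
            if potential(trial, capability, demand, travel) < phi - 1e-12:
                return False
    return True

# Hypothetical instance: 3 agents, 3 tasks.
capability = [3.0, 2.0, 4.0]
demand = [4.0, 3.0, 2.0]
travel = [[0.1, 0.5, 0.9], [0.4, 0.1, 0.3], [0.2, 0.6, 0.1]]
profile, history = best_response_dynamics(capability, demand, travel)
print(profile, history)
```

Each accepted switch strictly lowers the potential, and the set of strategy profiles is finite, so no profile can recur and the loop must stop at a profile where no agent benefits from deviating alone.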
Problem

Research questions and friction points this paper is trying to address.

heterogeneous AAVs
logistics task allocation
stochastic tasks
time-sensitive tasks
dynamic urban logistics
Innovation

Methods, ideas, or system contributions that make the work stand out.

overlapping coalition formation
reinforcement learning
transformer-based SAC
heterogeneous AAVs
potential game
Yuze Zhou
Beijing Institute of Technology, Beijing 100081, China; Key Laboratory of Dynamics and Control of Flight Vehicle, Ministry of Education, Beijing 100081, China; Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401121, China; National Key Laboratory of Land and Air-based Information Perception and Control, Beijing 100081, China
Jingliang Sun
Beijing Institute of Technology, Beijing 100081, China; Key Laboratory of Dynamics and Control of Flight Vehicle, Ministry of Education, Beijing 100081, China; Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401121, China; National Key Laboratory of Land and Air-based Information Perception and Control, Beijing 100081, China
Junzhi Li
Peking University
Evolutionary Computation, Neural Network, Machine Learning
Jianxin Zhong
Professor, Shanghai University & Xiangtan University
Condensed matter and materials physics
Zihan Wang
University of Electronic Science and Technology of China
AI Security, LLM Security
Teng Long
University of Amsterdam
Hyperbolic Learning, Quantization, Retrieval