AI Summary
This study addresses the allocation of stochastic, time-sensitive tasks among heterogeneous autonomous aerial vehicles (AAVs) in dynamic urban logistics. The authors propose a dynamic task assignment framework that integrates reinforcement learning with overlapping coalition formation (OCF) games. The key idea is to incorporate a Transformer architecture into the soft actor-critic network so that it can encode variable-length logistics states and capture spatiotemporal dependencies among tasks; the learned policy then replaces conventional heuristic rules for coalition updates. The coalition formation mechanism is proven to constitute an exact potential game, which guarantees convergence to a Nash-stable equilibrium within a finite number of steps. In a scenario with 32 AAVs and 80 tasks, the proposed method reduces the generalized logistics cost by 39.76% compared with a heuristic OCF baseline. Both high-fidelity simulations and indoor flight experiments validate the approach's effectiveness and practical applicability.
Abstract
In dynamic urban logistics, the stochastic emergence of time-sensitive tasks makes it difficult to allocate logistics tasks optimally among heterogeneous AAVs. To address this problem, a reinforcement-learning-enhanced overlapping coalition formation (OCF) game approach is proposed. A dynamic task allocation model is established in which global optimality is quantified by a generalized logistics cost that couples service quality and resource consumption. To handle the time-varying task sets induced by stochastic order arrivals, a Transformer-based soft actor-critic network is designed. By leveraging multi-head self-attention to encode variable-length logistics states and capture task-wise spatiotemporal dependencies, the learned policy adaptively guides coalition updates, replacing the heuristic rules in the OCF game. On this basis, heterogeneous AAVs can form more efficient overlapping coalitions for dynamic logistics tasks. The resulting coalition formation process is proven to constitute an exact potential game, which guarantees convergence to a Nash-stable equilibrium within a finite number of iterations. Numerical simulations demonstrate that the proposed algorithm improves the optimality of task allocation under the generalized logistics cost criterion. In a scenario with 32 AAVs and 80 tasks, the algorithm achieves a 39.76% cost reduction compared with the heuristic OCF baseline. Indoor flight experiments further validate its practicality.
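The finite-convergence claim can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the cost function, the penalty weight of 10, the capacity limit, and all function names are assumptions made for illustration, and the learned policy is replaced by a plain exhaustive search over each agent's options. Every agent is assumed to share the global cost as its own utility, which makes the toy game a trivially exact potential game with the global cost as the potential. Because each accepted unilateral move strictly lowers the cost and the number of joint assignments is finite, the loop must stop, and it stops at an assignment where no single agent can improve by changing its task set.

```python
import itertools


def candidate_task_sets(n_tasks, capacity):
    """All task subsets an agent may join, up to `capacity` tasks (overlap allowed)."""
    return [frozenset(c) for k in range(capacity + 1)
            for c in itertools.combinations(range(n_tasks), k)]


def global_cost(assignment, demand, travel):
    """Toy stand-in for a generalized cost: unmet demand plus travel effort."""
    cost = 0.0
    for task, need in enumerate(demand):
        served = sum(1 for tasks in assignment if task in tasks)
        cost += 10.0 * max(0, need - served)           # service-quality penalty (assumed weight)
    for agent, tasks in enumerate(assignment):
        cost += sum(travel[agent][t] for t in tasks)   # resource consumption
    return cost


def better_response_dynamics(demand, travel, capacity):
    """Agents take turns switching task sets whenever the switch lowers the global cost."""
    options = candidate_task_sets(len(demand), capacity)
    assignment = [frozenset() for _ in travel]
    cost = global_cost(assignment, demand, travel)
    history = [cost]
    improved = True
    while improved:
        improved = False
        for agent in range(len(travel)):
            for option in options:
                trial = assignment[:agent] + [option] + assignment[agent + 1:]
                trial_cost = global_cost(trial, demand, travel)
                if trial_cost < cost:                  # strict decrease of the potential
                    assignment, cost = trial, trial_cost
                    history.append(cost)
                    improved = True
    return assignment, history


def is_nash_stable(assignment, demand, travel, capacity):
    """True if no single agent can lower the cost by changing its own task set."""
    base = global_cost(assignment, demand, travel)
    for agent in range(len(assignment)):
        for option in candidate_task_sets(len(demand), capacity):
            trial = assignment[:agent] + [option] + assignment[agent + 1:]
            if global_cost(trial, demand, travel) < base:
                return False
    return True


if __name__ == "__main__":
    demand = [2, 1, 1]                                 # agents required per task (made-up data)
    travel = [[1, 4, 3], [2, 1, 5], [3, 2, 1]]         # travel[agent][task] (made-up data)
    final, history = better_response_dynamics(demand, travel, capacity=2)
    print("cost trajectory:", history)
    print("Nash-stable:", is_nash_stable(final, demand, travel, capacity=2))
```

In the approach described above, a learned Transformer-based policy guides which coalition update to attempt, whereas this sketch simply tries every option in turn; the convergence argument relies only on the strict decrease of the potential, not on how candidate moves are proposed.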