Separation Assurance between Heterogeneous Fleets of Small Unmanned Aerial Systems via Multi-Agent Reinforcement Learning

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses coordination complexity and fairness in the operation of heterogeneous fleets of small unmanned aerial systems (sUAS) within dense urban airspace. The authors employ PPOA2C, an attention-enhanced Proximal Policy Optimization-based Advantage Actor-Critic framework for multi-agent reinforcement learning, in which each fleet trains its own policy independently while preserving privacy. They show that two fleets with distinct PPOA2C policies can reach an equilibrium that maintains safe separation, and they examine how fleet configuration and policy type influence fairness. In a package delivery scenario over Dallas, Texas, PPOA2C policies outperform two rule-based baselines at conflict resolution; however, the resulting equilibria tend to favor fleets with stronger configurations or one particular policy type, which motivates fairness-aware conflict management.
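To make the notion of safe separation between fleets concrete, the following minimal sketch counts pairwise losses of separation in a single 2D snapshot and splits them into intra-fleet and inter-fleet conflicts. The function name `count_conflicts`, the `(fleet_id, x_m, y_m)` tuple layout, and the 50 m threshold are illustrative assumptions; none of them is taken from the paper.

```python
from itertools import combinations
from math import hypot


def count_conflicts(aircraft, min_separation_m):
    """Count aircraft pairs closer than min_separation_m.

    aircraft: list of (fleet_id, x_m, y_m) tuples (hypothetical layout).
    Returns (intra_fleet_conflicts, inter_fleet_conflicts).
    """
    intra = inter = 0
    for (fleet_a, xa, ya), (fleet_b, xb, yb) in combinations(aircraft, 2):
        if hypot(xa - xb, ya - yb) < min_separation_m:
            if fleet_a == fleet_b:
                intra += 1  # both aircraft belong to the same fleet
            else:
                inter += 1  # aircraft belong to different fleets
    return intra, inter


# Illustrative snapshot: two fleets, "A" and "B", positions in meters.
snapshot = [
    ("A", 0.0, 0.0),
    ("A", 30.0, 0.0),
    ("B", 0.0, 45.0),
    ("B", 500.0, 500.0),
]
# Threshold of 50 m is an assumed value, chosen only for this example.
intra, inter = count_conflicts(snapshot, min_separation_m=50.0)
print(intra, inter)
```

Tallying conflicts per fleet in this way is one simple basis for comparing how an equilibrium treats each fleet, although the paper's own evaluation metrics may differ.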

📝 Abstract

In the envisioned future dense urban airspace, multiple companies will operate heterogeneous fleets of small unmanned aerial systems (sUASs), where each fleet includes several homogeneous aircraft with identical policies and configurations, e.g., equipage, sensing, and communication ranges, making tactical deconfliction highly complex for the aircraft. This paper aims to address two core questions: (1) Can tactical deconfliction policies converge or reach an equilibrium to ensure a conflict-free airspace when companies operate heterogeneous fleets of homogeneous aircraft? (2) If so, will the converged policies discriminate against companies operating sUASs with weaker configurations? We investigate a multi-agent reinforcement learning paradigm in which homogeneous aircraft within heterogeneous fleets operate concurrently to perform package delivery missions over Dallas, Texas, USA. An attention-enhanced Proximal Policy Optimization-based Advantage Actor-Critic (PPOA2C) framework is employed to resolve intra- and inter-fleet conflicts, with each fleet independently training its own policy while preserving privacy. Experimental results show that two fleets with distinct, shared PPOA2C policies can reach an equilibrium to maintain safe separation. While two PPOA2C policies outperform two strong rule-based baselines in terms of conflict resolution, a PPOA2C policy exhibits safer interaction with a rule-based policy, indicating adaptive capabilities of PPOA2C policies. Furthermore, we conducted extensive policy-configuration evaluations, which reveal that equilibria between similar policy types tend to favor fleets with stronger configurations. Even under similar configurations but different policy types, the equilibrium favors one of the heterogeneous policies, underscoring the need for fairness-aware conflict management in heterogeneous sUAS operations.
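The abstract names a Proximal Policy Optimization-based actor-critic but does not spell out its loss. As a rough illustration, the sketch below computes the standard PPO clipped surrogate objective for a batch of sampled actions. It is the generic textbook form, not the paper's implementation: the function name, the clipping value of 0.2, and the omission of the attention module, value loss, and entropy term are all assumptions.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized) over a batch.

    new_log_probs / old_log_probs: log-probabilities of the taken actions
    under the current and the data-collecting policy.
    advantages: advantage estimates for the same actions.
    clip_eps: clipping range (0.2 is an assumed, commonly used value).
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes the incentive to push the ratio
        # outside the interval [1 - clip_eps, 1 + clip_eps].
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# Two illustrative samples: ratio 1.5 with positive advantage,
# ratio 0.5 with negative advantage.
objective = ppo_clipped_objective(
    new_log_probs=[math.log(1.5), math.log(0.5)],
    old_log_probs=[0.0, 0.0],
    advantages=[1.0, -1.0],
)
print(objective)
```

In the setting described above, each fleet would optimize an objective of this kind on its own experience only, which is consistent with the independent, privacy-preserving training the abstract describes.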

Problem

Research questions and friction points this paper is trying to address.

Separation Assurance

Heterogeneous Fleets

Small Unmanned Aerial Systems

Tactical Deconfliction

Fairness

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Agent Reinforcement Learning

Heterogeneous Fleets

Separation Assurance