Robust Multi-Agent Reinforcement Learning for Small UAS Separation Assurance under GPS Degradation and Spoofing

📅 2026-03-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of maintaining safe separation among small unmanned aircraft systems (sUAS) under GPS signal disruption or spoofing attacks. The authors model observation perturbations as a zero-sum game between the agents and an adversary, and derive a closed-form expression for the worst-case perturbation, accurate to second order, that requires no adversarial training. This expression is integrated into a multi-agent reinforcement learning policy gradient algorithm. Theoretical analysis shows that safety performance degrades at most linearly with the corruption probability. Experimental results in high-density sUAS scenarios show that, even with up to 35% corrupted observations, the proposed method achieves near-zero collision rates, substantially outperforming baselines trained without adversarial perturbations.
📝 Abstract
We address robust separation assurance for small Unmanned Aircraft Systems (sUAS) under GPS degradation and spoofing via Multi-Agent Reinforcement Learning (MARL). In cooperative surveillance, each aircraft (or agent) broadcasts its GPS-derived position; when such position broadcasts are corrupted, the entire observed air traffic state becomes unreliable. We cast this state observation corruption as a zero-sum game between the agents and an adversary: with probability R, the adversary perturbs the observed state to maximally degrade each agent's safety performance. We derive a closed-form expression for this adversarial perturbation, bypassing adversarial training entirely and enabling linear-time evaluation in the state dimension. We show that this expression approximates the true worst-case adversarial perturbation with second-order accuracy. We further bound the safety performance gap between clean and corrupted observations, showing that it degrades at most linearly with the corruption probability under Kullback-Leibler regularization. Finally, we integrate the closed-form adversarial policy into a MARL policy gradient algorithm to obtain a robust counter-policy for the agents. In a high-density sUAS simulation, we observe near-zero collision rates under corruption levels up to 35%, outperforming a baseline policy trained without adversarial perturbations.
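The paper's actual closed-form expression is not reproduced on this page, but the mechanism it describes (a gradient-based worst-case perturbation of the observed state, applied with probability R, with no adversarial training loop) can be illustrated with a minimal sketch. Here a toy quadratic surrogate stands in for the agent's safety value function, so the gradient, and hence the perturbation, is available in closed form; the names `value`, `worst_case_perturbation`, `corrupt_observation`, `eps`, and `r` are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def value(s, goal):
    # surrogate safety value: negative squared distance to a safe state.
    # A stand-in for the paper's learned value function; quadratic, so the
    # gradient below is exact and the worst case is closed-form.
    return -np.sum((s - goal) ** 2)

def grad_value(s, goal):
    # analytic gradient of the quadratic surrogate
    return -2.0 * (s - goal)

def worst_case_perturbation(s, goal, eps):
    # closed-form first-order perturbation: an eps-sized step against the
    # value gradient, computed in time linear in the state dimension,
    # with no adversarial training required
    g = grad_value(s, goal)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return np.zeros_like(s)
    return -eps * g / norm

def corrupt_observation(s, goal, eps, r, rng):
    # the corruption channel: with probability r the adversary replaces the
    # clean observation with its worst-case perturbed version
    if rng.random() < r:
        return s + worst_case_perturbation(s, goal, eps)
    return s
```

In the paper's setting, the agents' policies are then trained by a MARL policy gradient algorithm on observations passed through such a corruption channel, which is what yields the robust counter-policy evaluated in the experiments.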
Problem

Research questions and friction points this paper is trying to address.

small UAS
separation assurance
GPS spoofing
state observation corruption
multi-agent reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial Robustness
Multi-Agent Reinforcement Learning
Closed-Form Perturbation
GPS Spoofing
Separation Assurance