Robust Multi-Agent Reinforcement Learning for Small UAS Separation Assurance under GPS Degradation and Spoofing

📅 2026-03-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of maintaining safe separation among small unmanned aircraft systems (sUAS) when GPS-derived position broadcasts are degraded or spoofed. The authors model observation corruption as a zero-sum game between the agents and an adversary and derive a closed-form expression for the worst-case perturbation, accurate to second order, that requires no adversarial training and can be evaluated in time linear in the state dimension. This expression is integrated into a multi-agent reinforcement learning policy gradient algorithm. Theoretical analysis shows that, under Kullback-Leibler regularization, the safety performance gap between clean and corrupted observations grows at most linearly with the corruption probability. In a high-density sUAS simulation, the proposed method achieves near-zero collision rates at corruption levels up to 35%, outperforming a baseline policy trained without adversarial perturbations.

📝 Abstract

We address robust separation assurance for small Unmanned Aircraft Systems (sUAS) under GPS degradation and spoofing via Multi-Agent Reinforcement Learning (MARL). In cooperative surveillance, each aircraft (or agent) broadcasts its GPS-derived position; when such position broadcasts are corrupted, the entire observed air traffic state becomes unreliable. We cast this state observation corruption as a zero-sum game between the agents and an adversary: with probability R, the adversary perturbs the observed state to maximally degrade each agent's safety performance. We derive a closed-form expression for this adversarial perturbation, bypassing adversarial training entirely and enabling linear-time evaluation in the state dimension. We show that this expression approximates the true worst-case adversarial perturbation with second-order accuracy. We further bound the safety performance gap between clean and corrupted observations, showing that it degrades at most linearly with the corruption probability under Kullback-Leibler regularization. Finally, we integrate the closed-form adversarial policy into a MARL policy gradient algorithm to obtain a robust counter-policy for the agents. In a high-density sUAS simulation, we observe near-zero collision rates under corruption levels up to 35%, outperforming a baseline policy trained without adversarial perturbations.
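
The abstract does not state the closed-form expression itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a generic first-order construction: the observed state is shifted along the negative gradient of a safety value, bounded by an L2 radius `eps`, and the shift is applied with a given corruption probability. The names `worst_case_perturbation`, `corrupt_observation`, `toy_value`, and `toy_value_grad`, the L2 budget, and the toy value function are all hypothetical.

```python
import math
import random


def worst_case_perturbation(grad, eps):
    """First-order worst-case shift on an L2 ball of radius eps (assumed form).

    Minimizing the linearized value V(s + d) ~ V(s) + grad . d subject to
    ||d||_2 <= eps gives d = -eps * grad / ||grad||_2. The cost is linear
    in the state dimension.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        # Flat value surface: no preferred direction, so no shift.
        return [0.0] * len(grad)
    return [-eps * g / norm for g in grad]


def corrupt_observation(state, grad, eps, corruption_prob, rng):
    """Return the shifted state with probability corruption_prob, else a copy."""
    if rng.random() < corruption_prob:
        delta = worst_case_perturbation(grad, eps)
        return [s + d for s, d in zip(state, delta)]
    return list(state)


def toy_value(state):
    """Hypothetical safety value: squared distance to an intruder at the origin."""
    return sum(s * s for s in state)


def toy_value_grad(state):
    """Gradient of toy_value with respect to the state."""
    return [2.0 * s for s in state]


if __name__ == "__main__":
    rng = random.Random(0)
    state = [3.0, 4.0]
    observed = corrupt_observation(state, toy_value_grad(state), 0.5, 0.35, rng)
    print(observed, toy_value(observed))
```

In this toy setting the shift moves the observed position toward the intruder, which lowers the perceived safety value. A robust policy in the sense of the abstract would be trained against observations corrupted in this manner, with the paper's own closed-form expression in place of the stand-in used here.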

Problem

Research questions and friction points this paper is trying to address.

small UAS

separation assurance

GPS spoofing

state observation corruption

multi-agent reinforcement learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarial Robustness

Multi-Agent Reinforcement Learning

Closed-Form Perturbation