🤖 AI Summary
End-to-end autonomous aerial vehicle (AAV) navigation faces an inherent tension between low-frequency perception (constrained by sensor update rates and computational overhead) and high-frequency control requirements, which leaves conventional synchronous frameworks unable to deliver agility and robustness at once. To address this, we propose the first theoretically rigorous asynchronous end-to-end learning framework that decouples perception from control: a Temporal Encoding Module (TEM) explicitly models perceptual latency, while a two-stage curriculum learning strategy stabilizes joint training of the asynchronous perception and control pathways. The method integrates asynchronous reinforcement learning, real-time IMU state feedback, and latency-aware feature fusion. After validation in simulation, the policy transfers zero-shot to an onboard Intel NUC platform, sustaining a 100 Hz control frequency and achieving stable, agile flight in dense real-world environments with markedly improved robustness and adaptability.
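The summary describes the TEM and latency-aware fusion only at a high level. As an illustration, one plausible way to condition a policy on perception latency is a sinusoidal encoding of the measured delay, concatenated with the stale perception features and the fresh IMU state. Everything below (function names, dimensions, the normalization scheme) is a hypothetical sketch, not the authors' implementation:

```python
import numpy as np

def temporal_encoding(delay_s, dim=16, max_delay_s=0.5):
    """Sinusoidal encoding of a scalar perception delay.

    Analogous to transformer positional encodings: maps the latency to a
    dim-dimensional vector so a policy network can condition smoothly on
    how stale its perception features are. dim and max_delay_s are
    illustrative choices.
    """
    t = delay_s / max_delay_s                 # normalize delay to roughly [0, 1]
    freqs = 2.0 ** np.arange(dim // 2)        # geometric ladder of frequencies
    angles = np.pi * t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

def fuse(perception_feat, imu_state, delay_s):
    """Latency-aware fusion: stack the (possibly stale) perception features,
    the up-to-date IMU state, and the delay encoding into one policy input."""
    return np.concatenate([perception_feat, imu_state, temporal_encoding(delay_s)])
```

With this layout the policy sees the same perception vector at every control tick between perception updates, but the delay encoding changes each tick, letting the network discount stale information.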
📝 Abstract
Robust autonomous navigation for Autonomous Aerial Vehicles (AAVs) in complex environments is a critical capability. However, modern end-to-end navigation faces a key challenge: the high-frequency control loop needed for agile flight conflicts with low-frequency perception streams, which are limited by sensor update rates and significant computational cost. This mismatch forces conventional synchronous models to run at undesirably low control rates. To resolve this, we propose an asynchronous reinforcement learning framework that decouples perception and control, enabling a high-frequency policy to act on the latest IMU state for immediate reactivity while incorporating perception features asynchronously. To manage the resulting data staleness, we introduce a theoretically grounded Temporal Encoding Module (TEM) that explicitly conditions the policy on perception delays, complemented by a two-stage curriculum that ensures stable and efficient training. Validated in extensive simulations, our method was deployed via zero-shot sim-to-real transfer on an onboard Intel NUC, where it sustains a 100 Hz control rate and demonstrates robust, agile navigation in cluttered real-world environments. Our source code will be released for community reference.