🤖 AI Summary
Existing learning-based quadrotor navigation methods perform well in scenes with narrow obstacles, but their success rates drop sharply when the goal is blocked by large obstacles such as walls or terrain. To address this, the authors propose a reinforcement learning (RL) navigation method that combines efficient differentiable simulation with privileged information: a time-of-arrival (ToA) map serves as privileged information, and a yaw alignment loss guides the quadrotor around large obstacles. In photo-realistic simulation environments containing large obstacles, sharp corners, and dead-ends, the method achieves an 86% success rate and outperforms baseline strategies by 34%. Deployed onboard a custom quadrotor in cluttered outdoor environments, the policy completed 20 flights covering 589 m without collisions, at speeds up to 4 m/s, during both day and night. The core contribution is the combination of ToA maps as privileged information with a yaw alignment loss in RL training, which improves navigation when large obstacles block the goal.
📝 Abstract
This paper presents a reinforcement learning-based quadrotor navigation method that leverages efficient differentiable simulation, novel loss functions, and privileged information to navigate around large obstacles. Prior learning-based methods perform well in scenes with narrow obstacles, but struggle when the goal location is blocked by large walls or terrain. In contrast, the proposed method uses time-of-arrival (ToA) maps as privileged information and a yaw alignment loss to guide the robot around large obstacles. The policy is evaluated in photo-realistic simulation environments containing large obstacles, sharp corners, and dead-ends. Our approach achieves an 86% success rate and outperforms baseline strategies by 34%. We deploy the policy onboard a custom quadrotor in cluttered outdoor environments during both day and night. The policy is validated across 20 flights, covering 589 meters without collisions at speeds up to 4 m/s.
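To make the two ideas named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 2D occupancy grid, a breadth-first wavefront as a stand-in for whatever ToA solver the paper uses, and a loss of the form 1 - cos(yaw error). The names `toa_map`, `descent_heading`, and `yaw_alignment_loss` are hypothetical and chosen only for illustration.

```python
import math
from collections import deque

# 4-connected neighborhood as (d_row, d_col) offsets.
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def toa_map(grid, goal, speed=1.0, cell=1.0):
    """Time-of-arrival map: travel time from every free cell to the goal.

    Assumption: grid[r][c] == 1 marks an obstacle, motion is 4-connected,
    and speed is constant. Unreachable cells keep the value math.inf.
    """
    rows, cols = len(grid), len(grid[0])
    toa = [[math.inf] * cols for _ in range(rows)]
    toa[goal[0]][goal[1]] = 0.0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and toa[nr][nc] == math.inf:
                toa[nr][nc] = toa[r][c] + cell / speed
                queue.append((nr, nc))
    return toa


def descent_heading(toa, pos):
    """Heading atan2(d_row, d_col) toward the neighbor with the lowest ToA.

    Returns None when no neighbor improves on the current cell (e.g. at the goal).
    """
    r, c = pos
    best_val, best_dir = toa[r][c], None
    for dr, dc in NEIGHBORS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(toa) and 0 <= nc < len(toa[0]) and toa[nr][nc] < best_val:
            best_val, best_dir = toa[nr][nc], (dr, dc)
    return None if best_dir is None else math.atan2(best_dir[0], best_dir[1])


def yaw_alignment_loss(yaw, target_heading):
    """Hypothetical loss: 0 when yaw matches the target heading, 2 when opposite."""
    return 1.0 - math.cos(yaw - target_heading)
```

In a grid where a wall separates the start from the goal, the ToA value reflects the detour rather than the straight-line distance, and the descent heading points along the wall toward its opening instead of directly at the goal. That is the intuition behind using such a map to steer a policy around large obstacles.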