Quadrotor Navigation using Reinforcement Learning with Privileged Information

📅 2025-09-09
🤖 AI Summary
Existing learning-based quadrotor navigation methods perform well in scenes with narrow obstacles, but their success rates drop sharply when the goal is blocked by large static obstacles such as walls or terrain. To address this, the paper proposes a reinforcement learning (RL) navigation framework that combines privileged information with differentiable simulation: a time-of-arrival (ToA) map serves as a privileged input during training, and a yaw alignment loss guides the agent around large obstacles. In photo-realistic simulation, the method achieves an 86% success rate and outperforms baseline strategies by 34%. In outdoor flight tests (20 flights, 589 m total distance, speeds up to 4 m/s), the policy flew without collisions both during the day and at night. The core contribution is to embed a structured spatial prior (the ToA map) and a heading objective (yaw alignment) jointly into RL training, which improves robustness when the goal is occluded.

📝 Abstract
This paper presents a reinforcement learning-based quadrotor navigation method that leverages efficient differentiable simulation, novel loss functions, and privileged information to navigate around large obstacles. Prior learning-based methods perform well in scenes that exhibit narrow obstacles, but struggle when the goal location is blocked by large walls or terrain. In contrast, the proposed method utilizes time-of-arrival (ToA) maps as privileged information and a yaw alignment loss to guide the robot around large obstacles. The policy is evaluated in photo-realistic simulation environments containing large obstacles, sharp corners, and dead-ends. Our approach achieves an 86% success rate and outperforms baseline strategies by 34%. We deploy the policy onboard a custom quadrotor in outdoor cluttered environments both during the day and night. The policy is validated across 20 flights, covering 589 meters without collisions at speeds up to 4 m/s.
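
To make the idea of a time-of-arrival map concrete, here is a minimal sketch, not the paper's implementation. It assumes a 2D occupancy grid, a constant flight speed, and 4-connected motion, and it uses Dijkstra's algorithm as a stand-in for whatever solver the authors actually use. The names `time_of_arrival_map`, `occupancy`, `speed`, and `cell_size` are hypothetical.

```python
import heapq
import math


def time_of_arrival_map(occupancy, goal, speed=1.0, cell_size=1.0):
    """Return a grid of minimum travel times (seconds) from each cell to the goal.

    occupancy: list of rows, truthy entries are obstacles (assumed layout).
    goal: (row, col) of a free cell.
    Obstacle cells and unreachable cells keep the value math.inf.
    """
    rows, cols = len(occupancy), len(occupancy[0])
    goal_row, goal_col = goal
    if occupancy[goal_row][goal_col]:
        raise ValueError("goal must be a free cell")

    toa = [[math.inf] * cols for _ in range(rows)]
    toa[goal_row][goal_col] = 0.0
    frontier = [(0.0, goal_row, goal_col)]
    step_time = cell_size / speed  # time to cross one cell at constant speed

    while frontier:
        t, r, c = heapq.heappop(frontier)
        if t > toa[r][c]:
            continue  # stale queue entry, a shorter time was already found
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not occupancy[nr][nc]:
                nt = t + step_time
                if nt < toa[nr][nc]:
                    toa[nr][nc] = nt
                    heapq.heappush(frontier, (nt, nr, nc))
    return toa


# A wall (1s) separates the bottom row from the goal in the top row.
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
toa = time_of_arrival_map(grid, goal=(0, 2))
```

In this toy grid the cell directly behind the wall, `(2, 2)`, is only two cells from the goal in a straight line, yet its arrival time reflects the six-cell detour around the wall. That detour-aware signal is what a straight-line distance to the goal cannot provide when a large obstacle blocks the way.
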
Problem

Research questions and friction points this paper is trying to address.

Quadrotor navigation around large obstacles using reinforcement learning
Overcoming limitations of prior methods with large walls and terrain
Utilizing time-of-arrival maps and yaw alignment for guidance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning with privileged information
Time-of-arrival maps and yaw alignment
Differentiable simulation for navigation training
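
The summary does not give the form of the yaw alignment loss, so the following is only an illustrative sketch of one common choice: penalize the angle between the vehicle's heading and a desired travel direction with `1 - cos(error)`. The function name `yaw_alignment_loss` and the assumption that the desired direction is a 2D vector (for example, the direction of steepest descent of a ToA map) are hypothetical. In a differentiable-simulation setting the same expression would be written in an autodiff framework; plain Python is used here for clarity.

```python
import math


def yaw_alignment_loss(yaw, direction):
    """Penalty in [0, 2] for misalignment between heading and a desired direction.

    yaw: vehicle heading in radians, measured from the +x axis.
    direction: (dx, dy) desired travel direction (assumed input, need not be unit length).
    Returns 0 when the heading matches the direction and 2 when it points the opposite way.
    """
    dx, dy = direction
    if dx == 0.0 and dy == 0.0:
        return 0.0  # no preferred direction, so no penalty
    desired_yaw = math.atan2(dy, dx)
    # The cosine makes the loss smooth and periodic, avoiding the wrap-around at +/- pi.
    return 1.0 - math.cos(yaw - desired_yaw)


aligned = yaw_alignment_loss(0.0, (1.0, 0.0))
sideways = yaw_alignment_loss(math.pi / 2, (1.0, 0.0))
opposite = yaw_alignment_loss(math.pi, (1.0, 0.0))
```

A cosine-based penalty is smooth everywhere, which suits gradient-based training; a raw angle difference would jump at the ±π boundary.
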
Authors
Jonathan Lee
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213 USA
Abhishek Rathod
National Robotics Engineering Center
Autonomous Aerial Vehicles, Model Predictive Control, Motion Planning
Kshitij Goel
Carnegie Mellon University
Robotics
John Stecklein
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213 USA
Wennie Tabib
Carnegie Mellon University
Robotics, Active Perception, SLAM