Quadrotor Navigation using Reinforcement Learning with Privileged Information

📅 2025-09-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing learning-based quadrotor navigation methods perform well in scenes with narrow obstacles, but their success rates drop sharply when the goal is blocked by large static obstacles such as walls or terrain. To address this, we propose a reinforcement learning (RL) navigation framework that combines privileged information with efficient differentiable simulation: a time-of-arrival (ToA) map serves as privileged information during training, and a yaw alignment loss guides the robot around large obstacles. In photo-realistic simulation environments containing large obstacles, sharp corners, and dead-ends, the method achieves an 86% success rate and outperforms baseline strategies by 34%. Deployed onboard a custom quadrotor in cluttered outdoor environments (20 flights, 589 m total distance, speeds up to 4 m/s), the policy flew without collisions during both day and night. The core contribution is the combination of a privileged spatial signal (ToA maps) with a heading objective (the yaw alignment loss) in RL training, which improves navigation when the goal is occluded by large obstacles.

📝 Abstract

This paper presents a reinforcement learning-based quadrotor navigation method that leverages efficient differentiable simulation, novel loss functions, and privileged information to navigate around large obstacles. Prior learning-based methods perform well in scenes that exhibit narrow obstacles, but struggle when the goal location is blocked by large walls or terrain. In contrast, the proposed method utilizes time-of-arrival (ToA) maps as privileged information and a yaw alignment loss to guide the robot around large obstacles. The policy is evaluated in photo-realistic simulation environments containing large obstacles, sharp corners, and dead-ends. Our approach achieves an 86% success rate and outperforms baseline strategies by 34%. We deploy the policy onboard a custom quadrotor in outdoor cluttered environments both during the day and night. The policy is validated across 20 flights, covering 589 meters without collisions at speeds up to 4 m/s.
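To make the idea of a time-of-arrival map concrete, here is a minimal sketch, not the paper's implementation. It assumes a 2D occupancy grid, 4-connected motion at constant speed, and Dijkstra's algorithm; the abstract does not say how the ToA maps are actually computed, and the names `toa_map`, `speed`, and `cell_size` are hypothetical.

```python
import heapq
import math


def toa_map(grid, goal, speed=1.0, cell_size=1.0):
    """Return travel time from every free cell to `goal` (inf if blocked or unreachable).

    grid: list of rows, 0 = free cell, 1 = obstacle (assumed representation).
    goal: (row, col) of the goal cell.
    """
    rows, cols = len(grid), len(grid[0])
    goal_r, goal_c = goal
    if grid[goal_r][goal_c] == 1:
        raise ValueError("goal lies inside an obstacle")

    toa = [[math.inf] * cols for _ in range(rows)]
    toa[goal_r][goal_c] = 0.0
    heap = [(0.0, goal_r, goal_c)]
    step_time = cell_size / speed  # time to cross one cell

    # Dijkstra expansion outward from the goal over 4-connected free cells.
    while heap:
        t, r, c = heapq.heappop(heap)
        if t > toa[r][c]:
            continue  # stale heap entry
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nt = t + step_time
                if nt < toa[nr][nc]:
                    toa[nr][nc] = nt
                    heapq.heappush(heap, (nt, nr, nc))
    return toa


# A wall (column 2, rows 0-1) separates the start (0, 0) from the goal (0, 4).
grid = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
times = toa_map(grid, goal=(0, 4))
print(times[0][0])  # 8.0: the detour around the wall, versus 4 cells in a straight line
```

Unlike the straight-line distance to the goal, the map's values decrease along the detour around the wall, which is the kind of signal that can steer a policy around a large obstacle.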

Problem

Research questions and friction points this paper is trying to address.

Quadrotor navigation around large obstacles using reinforcement learning

Overcoming limitations of prior methods with large walls and terrain

Utilizing time-of-arrival maps and yaw alignment for guidance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning with privileged information

Time-of-arrival maps and yaw alignment

Differentiable simulation for navigation training
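As an illustration of how a yaw alignment term could use a ToA map, the sketch below penalizes the angle between the vehicle's yaw and the direction in which the ToA decreases. This is an assumption for illustration only: the page does not give the form of the paper's loss, and `descent_heading`, `yaw_alignment_loss`, the `1 - cos` penalty, and the grid axis convention (x along columns, y along rows) are all hypothetical choices.

```python
import math


def descent_heading(toa, r, c):
    """Heading (rad) from cell (r, c) toward the 4-neighbor with the lowest ToA.

    Returns None if no neighbor improves on the current cell (e.g. at the goal).
    Convention assumed here: heading = atan2(d_row, d_col).
    """
    best_step, best_time = None, toa[r][c]
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(toa) and 0 <= nc < len(toa[0]) and toa[nr][nc] < best_time:
            best_step, best_time = (dr, dc), toa[nr][nc]
    if best_step is None:
        return None
    return math.atan2(best_step[0], best_step[1])


def yaw_alignment_loss(yaw, target_heading):
    """0 when aligned, 2 when facing the opposite way; smooth across the +/-pi wrap."""
    return 1.0 - math.cos(yaw - target_heading)


INF = math.inf
# Hand-written ToA grid: goal at (2, 2), obstacle cells at (1, 1) and (2, 1).
toa = [
    [4.0, 3.0, 2.0],
    [5.0, INF, 1.0],
    [6.0, INF, 0.0],
]

target = descent_heading(toa, 1, 0)      # points "up" the grid, around the obstacle
toward_goal = math.atan2(2 - 1, 2 - 0)   # straight-line bearing to the goal
print(yaw_alignment_loss(target, target))       # aligned with the detour
print(yaw_alignment_loss(toward_goal, target))  # facing the goal through the obstacle
```

In this toy case, facing the goal directly incurs a larger penalty than facing along the detour, because the straight-line bearing passes through the obstacle. A cosine-based penalty is used here only because it is differentiable and avoids angle wrap-around issues.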

🔎 Similar Papers

A Reinforcement Learning Based Motion Planner for Quadrotor Autonomous Flight in Dense Environment