Reactive Aerobatic Flight via Reinforcement Learning

📅 2025-05-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address underactuation, low trajectory tracking accuracy under high-maneuver conditions, and poor sim-to-real transferability in quadrotor control, this paper proposes an end-to-end deep reinforcement learning framework that maps system states and high-level intentions directly to low-level control commands, enabling agile, reactive, coplanar aerobatic flight. The authors introduce a fully automated curriculum learning strategy that dynamically adjusts training difficulty, and combine domain randomization with robust policy optimization to achieve zero-shot sim-to-real transfer. To the best of the authors' knowledge, this is the first work to demonstrate sustained inverted flight and real-time navigation through moving doorframes on a physical quadrotor platform. Experimental results show significant improvements in response latency, trajectory tracking precision, and environmental adaptability, overcoming the performance bottlenecks of conventional hierarchical control architectures under extreme dynamic maneuvers.

📝 Abstract
Quadrotors have demonstrated remarkable versatility, yet their full aerobatic potential remains largely untapped due to inherent underactuation and the complexity of aggressive maneuvers. Traditional approaches, separating trajectory optimization and tracking control, suffer from tracking inaccuracies, computational latency, and sensitivity to initial conditions, limiting their effectiveness in dynamic, high-agility scenarios. Inspired by recent breakthroughs in data-driven methods, we propose a reinforcement learning-based framework that directly maps drone states and aerobatic intentions to control commands, eliminating modular separation to enable quadrotors to perform end-to-end policy optimization for extreme aerobatic maneuvers. To ensure efficient and stable training, we introduce an automated curriculum learning strategy that dynamically adjusts aerobatic task difficulty. Enabled by domain randomization for robust zero-shot sim-to-real transfer, our approach is validated in demanding real-world experiments, including the first demonstration of a drone autonomously performing continuous inverted flight while reactively navigating a moving gate, showcasing unprecedented agility.
Problem

Research questions and friction points this paper is trying to address.

Enabling quadrotors to perform extreme aerobatic maneuvers
Overcoming tracking inaccuracies in dynamic high-agility scenarios
Achieving robust zero-shot sim-to-real transfer for drones
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning maps states to commands
Automated curriculum adjusts task difficulty
Domain randomization enables sim-to-real transfer
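The two training-side ideas above can be illustrated with a minimal sketch. The class below is a toy episode generator, not the paper's actual implementation: the parameter names, ranges, and success-rate thresholds are all illustrative assumptions. It shows (1) an automated curriculum that raises task difficulty when the policy's recent success rate is high, and (2) domain randomization that resamples physical parameters every episode so the policy cannot overfit to one simulator configuration.

```python
import random

class CurriculumDomainRandEnv:
    """Toy episode generator sketching automated curriculum learning
    plus domain randomization. All values are illustrative assumptions,
    not the paper's actual parameters."""

    def __init__(self):
        self.difficulty = 0.1        # e.g. gate speed / maneuver aggressiveness
        self.recent_successes = []   # rolling record of episode outcomes

    def sample_episode_params(self):
        # Domain randomization: perturb dynamics each episode.
        return {
            "mass_kg": random.uniform(0.9, 1.1),         # +/-10% around a nominal 1.0 kg
            "motor_delay_s": random.uniform(0.0, 0.03),  # unmodeled actuation lag
            "gate_speed": self.difficulty,               # curriculum-controlled task difficulty
        }

    def report(self, success: bool):
        # Automated curriculum: after every 20 episodes, raise difficulty
        # if the success rate exceeds 80%, lower it if below 30%.
        self.recent_successes.append(success)
        window = self.recent_successes[-20:]
        if len(window) < 20:
            return
        rate = sum(window) / len(window)
        if rate > 0.8:
            self.difficulty = min(1.0, self.difficulty + 0.05)
            self.recent_successes.clear()
        elif rate < 0.3:
            self.difficulty = max(0.1, self.difficulty - 0.05)
            self.recent_successes.clear()
```

A training loop would call `sample_episode_params()` before each rollout and `report()` after it; the policy only ever sees the randomized environment, which is what makes zero-shot transfer plausible.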
👥 Authors
Zhichao Han, Zhejiang University (PhD Student). Interests: Robotics.
Xijie Huang, Hong Kong University of Science and Technology. Interests: Efficient Deep Learning, Model Compression.
Zhuxiu Xu, Huzhou Institute, Zhejiang University, Huzhou 313000, China.
Jiarui Zhang, State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310027, China; Huzhou Institute, Zhejiang University, Huzhou 313000, China.
Yuze Wu, Zhejiang University. Interests: Control & Planning, Robot Learning, Embodied Intelligence.
Mingyang Wang, University of Munich (LMU Munich). Interests: Natural Language Processing.
Tianyue Wu, Zhejiang University (Undergraduate). Interests: Robotics, Robot Learning, Optimization, Aerial Robots, Dexterous Manipulation.
Fei Gao, State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310027, China; Huzhou Institute, Zhejiang University, Huzhou 313000, China.