🤖 AI Summary
To address the challenges of underactuation, low trajectory-tracking accuracy during aggressive maneuvers, and poor sim-to-real transferability in quadrotor control, this paper proposes an end-to-end deep reinforcement learning framework that directly maps system states and high-level intentions to low-level control commands, enabling agile, reactive, coplanar aerobatic flight. We introduce a novel fully automated curriculum learning strategy that dynamically adjusts training difficulty, and integrate domain randomization with robust policy optimization to achieve zero-shot sim-to-real transfer. To the best of our knowledge, this is the first work to demonstrate sustained inverted flight with reactive navigation through a moving gate on a physical quadrotor platform. Experimental results show significant improvements in response latency, trajectory-tracking precision, and environmental adaptability, thereby overcoming performance bottlenecks inherent in conventional hierarchical control architectures during extreme dynamic maneuvers.
📝 Abstract
Quadrotors have demonstrated remarkable versatility, yet their full aerobatic potential remains largely untapped due to inherent underactuation and the complexity of aggressive maneuvers. Traditional approaches, which separate trajectory optimization from tracking control, suffer from tracking inaccuracies, computational latency, and sensitivity to initial conditions, limiting their effectiveness in dynamic, high-agility scenarios. Inspired by recent breakthroughs in data-driven methods, we propose a reinforcement learning-based framework that directly maps drone states and aerobatic intentions to control commands, eliminating the modular separation and enabling end-to-end policy optimization for extreme aerobatic maneuvers. To ensure efficient and stable training, we introduce an automated curriculum learning strategy that dynamically adjusts aerobatic task difficulty. Enabled by domain randomization for robust zero-shot sim-to-real transfer, our approach is validated in demanding real-world experiments, including the first demonstration of a drone autonomously performing continuous inverted flight while reactively navigating a moving gate, showcasing unprecedented agility.
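The abstract does not spell out how the curriculum adjusts difficulty or which simulator parameters are randomized. As a rough illustration only, the Python sketch below assumes a success-rate rule: difficulty goes up when the policy succeeds often over a recent window of episodes, and goes down when it fails often. The class `AutoCurriculum`, the function `sample_dynamics`, the thresholds, and the parameter ranges are all hypothetical and are not taken from the paper.

```python
import random
from collections import deque


class AutoCurriculum:
    """Hypothetical success-rate-driven curriculum (an assumption, not the paper's rule)."""

    def __init__(self, step=0.1, window=20, raise_at=0.8, lower_at=0.3):
        self.difficulty = 0.0      # 0 = easiest task, 1 = hardest task
        self.step = step           # how far difficulty moves per adjustment
        self.raise_at = raise_at   # success rate that triggers a harder task
        self.lower_at = lower_at   # success rate that triggers an easier task
        self.results = deque(maxlen=window)

    def update(self, success: bool) -> float:
        """Record one episode outcome and return the (possibly new) difficulty."""
        self.results.append(1.0 if success else 0.0)
        if len(self.results) == self.results.maxlen:
            rate = sum(self.results) / len(self.results)
            if rate >= self.raise_at:
                self.difficulty = min(1.0, self.difficulty + self.step)
                self.results.clear()  # re-evaluate at the new level
            elif rate <= self.lower_at:
                self.difficulty = max(0.0, self.difficulty - self.step)
                self.results.clear()
        return self.difficulty


def sample_dynamics(rng: random.Random) -> dict:
    """Domain randomization: draw perturbed simulator parameters for one episode.

    The parameter names and ranges are illustrative placeholders.
    """
    return {
        "mass_scale": rng.uniform(0.8, 1.2),
        "thrust_scale": rng.uniform(0.9, 1.1),
        "action_delay_s": rng.uniform(0.0, 0.03),
    }
```

In a training loop, one would presumably call `sample_dynamics` at every episode reset and `update` at every episode end, using the returned difficulty to scale the aerobatic task (for example, the required rotation rate or gate speed). That mapping is likewise an assumption.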