🤖 AI Summary
To address the need for adaptive and robust control in dynamic environments for autonomous micro-drones pursuing multiple objectives—energy efficiency, high-precision localization, and high-speed navigation—this paper proposes AirPilot: an interpretable deep reinforcement learning (DRL)-enhanced nonlinear PID controller based on Proximal Policy Optimization (PPO). Methodologically, AirPilot integrates DRL with classical PID control via a novel deep fusion architecture enabling online, automatic parameter tuning. It is trained in Gazebo simulation and deployed end-to-end on real hardware—the COEX Clover micro-drone—using ROS/PX4 hardware-in-the-loop integration. Key contributions include the first deep architectural integration of DRL and PID for online adaptation, and the first real-world validation of such a hybrid controller on a resource-constrained micro-UAV. Experimental results demonstrate that AirPilot reduces navigation error by 90% compared to the default PX4 PID, while improving effective navigation speed by 21%, reducing settling time by 17%, and decreasing overshoot by 16% relative to manually tuned PID.
📝 Abstract
Navigation precision, speed, and stability are crucial for safe Unmanned Aerial Vehicle (UAV) flight maneuvers and effective mission execution in dynamic environments. Different flight missions may have different objectives, such as minimizing energy consumption, achieving precise positioning, or maximizing speed, so a controller that can adapt to different objectives on the fly is highly valuable. Proportional Integral Derivative (PID) controllers are among the most popular and widely used control algorithms for drones and other control systems, but their linear control law fails to capture the nonlinear nature of dynamic wind conditions and the complex drone system. Manually tuning the PID gains for various missions is time-consuming and requires significant expertise. This paper presents AirPilot, a nonlinear Deep Reinforcement Learning (DRL)-enhanced PID drone controller using Proximal Policy Optimization (PPO). The AirPilot controller combines the simplicity and effectiveness of traditional PID control with the adaptability, learning capability, and optimization potential of DRL, making it better suited for modern drone applications where the environment is dynamic and mission-specific performance demands are high. We trained the DRL agent on a COEX Clover autonomous drone in simulation and deployed it in a real-world lab setting, which marks a significant milestone as one of the first attempts to apply a DRL-based flight controller to an actual drone. AirPilot reduces the navigation error of the default PX4 PID position controller by 90%, improves the effective navigation speed of a fine-tuned PID controller by 21%, and reduces settling time and overshoot by 17% and 16%, respectively.
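To make the hybrid scheme concrete, here is a minimal sketch of a PID position loop whose gains are adapted online by a policy, the core idea the abstract describes. All names, the toy double-integrator plant, and the hand-coded `gain_policy` (a stand-in for a trained PPO actor) are illustrative assumptions, not the paper's actual architecture or gains:

```python
# Sketch of an online gain-adapted PID loop (hypothetical names and dynamics).
# In AirPilot the gains would come from a trained PPO policy; here a simple
# hand-coded rule stands in for that learned actor.

class AdaptivePID:
    """Discrete-time PID controller whose gains can be updated every step."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0

    def set_gains(self, kp, ki, kd):
        # Called by the (learned) policy before each control step.
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def gain_policy(error):
    """Stand-in for a PPO actor: larger tracking errors get more aggressive gains."""
    scale = min(abs(error), 1.0)
    return 2.0 + 2.0 * scale, 0.1, 1.0 + 1.0 * scale


def simulate(setpoint=1.0, steps=500, dt=0.02):
    """Drive a toy double-integrator plant to the setpoint with adaptive gains."""
    pid = AdaptivePID(2.0, 0.1, 1.0, dt)
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        error = setpoint - pos
        pid.set_gains(*gain_policy(error))  # online gain adaptation
        accel = pid.step(error)             # controller output = acceleration command
        vel += accel * dt                   # integrate toy plant dynamics
        pos += vel * dt
    return pos
```

In the real system, `gain_policy` would be replaced by the PPO network's forward pass over the drone's state observation, and the plant by the actual PX4-controlled quadrotor dynamics.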