🤖 AI Summary
To address insufficient attitude control accuracy of quadrotor UAVs and the limitations of conventional PID controllers—namely, manual gain tuning and poor adaptability to dynamic environments—this paper proposes an online adaptive PID gain optimization method based on Deep Deterministic Policy Gradient (DDPG). For the first time, DDPG is employed to perform real-time, closed-loop adjustment of PID gains (proportional, integral, and derivative) during flight, using state feedback for continuous parameter updates. High-fidelity simulation is conducted in MATLAB/Simulink integrated with the PX4 UAV Toolbox. Experimental results demonstrate significant reductions in attitude tracking error, markedly improved trajectory tracking accuracy and robustness, and clear superiority over manually tuned PID baselines. This work establishes a deployable, reinforcement learning–driven paradigm for intelligent, adaptive PID tuning in autonomous UAV control.
📝 Abstract
A reinforcement learning (RL) based methodology is proposed and implemented for online fine-tuning of PID controller gains, thereby improving the effectiveness and accuracy of quadrotor trajectory tracking. The RL agent is first trained offline on a quadrotor PID attitude controller and then validated through simulations and experimental flights. The agent uses the Deep Deterministic Policy Gradient (DDPG) algorithm, an off-policy actor-critic method. Training and simulation studies are performed using MATLAB/Simulink and the UAV Toolbox Support Package for PX4 Autopilots. Performance is evaluated by comparing the hand-tuned and RL-tuned controllers. The results show that the RL-based controller adjusts its parameters during flight, achieving the smallest attitude errors and thus significantly improving attitude tracking performance over the hand-tuned approach.
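To make the control scheme concrete, here is a minimal sketch of the idea: a PID controller whose gains are overwritten at every control step by a policy function, with the policy standing in for the trained DDPG actor. Everything here is illustrative and not from the paper: the first-order plant, the gain schedule in `actor_policy`, and all numeric values are assumptions chosen only to show the closed loop running.

```python
import numpy as np

class AdaptivePID:
    """PID controller whose gains can be retuned online at every step."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt
        self.integral = 0.0
        self.prev_error = None

    def set_gains(self, kp, ki, kd):
        # In the paper's scheme a trained DDPG actor outputs these gains
        # from state feedback; here they come from a stand-in policy.
        self.kp, self.ki, self.kd = kp, ki, kd

    def step(self, error):
        self.integral += error * self.dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def actor_policy(error):
    # Stand-in for the trained DDPG actor network (a hypothetical gain
    # schedule, NOT the paper's learned policy): larger errors get a
    # larger proportional gain, clipped to a safe range.
    kp = np.clip(2.0 + abs(error), 1.0, 4.0)
    return kp, 0.5, 0.1  # (kp, ki, kd)

# Closed-loop demo on a toy first-order attitude model,
#   theta_dot = -a*theta + b*u,
# standing in for one axis of the quadrotor attitude dynamics.
dt, a, b = 0.01, 1.0, 2.0
theta, target = 0.0, 1.0
pid = AdaptivePID(2.0, 0.5, 0.1, dt)
for _ in range(2000):                    # 20 s of simulated flight
    error = target - theta
    pid.set_gains(*actor_policy(error))  # online gain update each step
    u = pid.step(error)                  # PID control action
    theta += dt * (-a * theta + b * u)   # Euler integration of the plant
print(round(theta, 3))
```

In the paper's setup, `actor_policy` would be the DDPG actor trained offline in MATLAB/Simulink, and the loop above corresponds to the closed-loop attitude controller running in simulation or on the PX4-based vehicle.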