Global End-Effector Pose Control of an Underactuated Aerial Manipulator via Reinforcement Learning

📅 2025-12-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Lightweight underactuated aerial manipulators struggle to achieve globally stable 6-DoF end-effector pose control under strong environmental disturbances. Method: This paper proposes a hierarchical control framework that combines Proximal Policy Optimization (PPO) reinforcement learning with Incremental Nonlinear Dynamic Inversion (INDI), enabling full 6-DoF end-effector control using only a 2-DoF manipulator mounted on a quadrotor. A differential mechanism design and sim-to-real transfer training address the limited joint degrees of freedom, model uncertainty, and sensitivity to contact-induced disturbances. Contribution/Results: Experiments demonstrate centimeter-level positional accuracy and degree-level orientation accuracy. The system performs heavy-object transportation and stable contact tasks under significant thrust disturbances, improving the robustness and practical applicability of underactuated aerial manipulators in real-world scenarios.

📝 Abstract
Aerial manipulators, which combine robotic arms with multi-rotor drones, face strict constraints on arm weight and mechanical complexity. In this work, we study a lightweight 2-degree-of-freedom (DoF) arm mounted on a quadrotor via a differential mechanism, capable of full six-DoF end-effector pose control. While the minimal design enables simplicity and reduced payload, it also introduces challenges such as underactuation and sensitivity to external disturbances, including manipulation of heavy loads and pushing tasks. To address these, we employ reinforcement learning, training a Proximal Policy Optimization (PPO) agent in simulation to generate feedforward commands for quadrotor acceleration and body rates, along with joint angle targets. These commands are tracked by an incremental nonlinear dynamic inversion (INDI) attitude controller and a PID joint controller, respectively. Flight experiments demonstrate centimeter-level position accuracy and degree-level orientation precision, with robust performance under external force disturbances. The results highlight the potential of learning-based control strategies for enabling contact-rich aerial manipulation using simple, lightweight platforms.
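The command structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-value action layout (3 acceleration, 3 body-rate, 2 joint-angle values), the names `PolicyCommand`, `split_action`, and `JointPID`, and all gains are assumptions made for illustration only.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class PolicyCommand:
    """One policy output, split into the three command groups (assumed layout)."""
    accel_ff: np.ndarray       # feedforward quadrotor acceleration, shape (3,)
    body_rates_ff: np.ndarray  # feedforward body rates, shape (3,)
    joint_targets: np.ndarray  # joint angle targets for the 2-DoF arm, shape (2,)


def split_action(action) -> PolicyCommand:
    """Split a flat action vector into command groups (hypothetical 3+3+2 layout)."""
    a = np.asarray(action, dtype=float)
    if a.shape != (8,):
        raise ValueError(f"expected 8 values, got shape {a.shape}")
    return PolicyCommand(a[0:3], a[3:6], a[6:8])


class JointPID:
    """Textbook discrete PID tracking joint angle targets; gains are placeholders."""

    def __init__(self, kp, ki, kd, dt, n_joints=2):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = np.zeros(n_joints)
        self.prev_error = np.zeros(n_joints)

    def step(self, target, measured):
        error = np.asarray(target, dtype=float) - np.asarray(measured, dtype=float)
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In such a layout, the acceleration and body-rate parts would go to the attitude controller, while `joint_targets` would be passed to `JointPID.step` together with the measured joint angles at every control tick.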
Problem

Research questions and friction points this paper is trying to address.

Controlling the end-effector pose of an underactuated aerial manipulator
Handling external disturbances and heavy-load manipulation
Achieving precise control with a lightweight, mechanically simple design
Innovation

Methods, ideas, or system contributions that make the work stand out.

A PPO agent trained in simulation generates feedforward acceleration and body-rate commands, along with joint angle targets
An INDI attitude controller and a PID joint controller track these commands
Flight experiments achieve centimeter-level position and degree-level orientation precision
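For readers unfamiliar with INDI, the core idea is an incremental control update: the actuator command is adjusted by the gap between the desired and the measured angular acceleration, mapped through a control-effectiveness matrix. The sketch below shows only that generic update, under stated assumptions; the function name `indi_update`, the matrix `G`, and the numeric values are hypothetical and are not taken from the paper.

```python
import numpy as np


def indi_update(u_prev, nu_des, accel_meas, G):
    """Generic INDI step: u = u_prev + pinv(G) @ (nu_des - accel_meas).

    u_prev     : previous actuator command
    nu_des     : desired angular acceleration (virtual control)
    accel_meas : measured (typically filtered) angular acceleration
    G          : control-effectiveness matrix (assumed known; hypothetical here)
    """
    u_prev = np.asarray(u_prev, dtype=float)
    residual = np.asarray(nu_des, dtype=float) - np.asarray(accel_meas, dtype=float)
    # The pseudo-inverse also covers non-square effectiveness matrices.
    delta_u = np.linalg.pinv(np.asarray(G, dtype=float)) @ residual
    return u_prev + delta_u


# Illustrative values only: a diagonal effectiveness matrix with gain 2.
G = 2.0 * np.eye(3)
u_next = indi_update(np.zeros(3), [2.0, 0.0, 0.0], np.zeros(3), G)
```

Because the update works on measured acceleration rather than on a full dynamics model, unmodeled torques (for example from contact or a heavy payload) show up in `accel_meas` and are compensated in the next increment.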
Shlok Deshmukh
Department of Cognitive Robotics, Faculty of Mechanical Engineering, Delft University of Technology, Delft, Netherlands
Javier Alonso-Mora
Associate Professor, Delft University of Technology
Robotics, Intelligent Transportation, Motion Planning, Multi-Robot Systems, Artificial Intelligence
Sihao Sun
Department of Cognitive Robotics, Faculty of Mechanical Engineering, Delft University of Technology, Delft, Netherlands