🤖 AI Summary
To address the challenge of achieving large workspace coverage and high-precision end-effector pose tracking for quadrupedal robots on unstructured, rough terrain, this paper proposes a reinforcement learning (RL)-based whole-body control framework. Its key contributions are: (1) a terrain-aware sampling strategy for initial configurations and end-effector pose commands that makes whole-body training robust to terrain disturbances; and (2) a game-based curriculum that jointly optimizes locomotion stability and end-effector tracking accuracy, overcoming two bottlenecks of existing RL approaches: a limited operational workspace and insufficient pose-tracking precision. Experiments on the ANYmal platform demonstrate an average end-effector position error of 2.64 cm and an orientation error of 3.64°, significantly outperforming baseline methods. The learned policy also generalizes to diverse complex terrains, including stairs and inclined slopes.
📝 Abstract
Combining manipulation with the mobility of legged robots is essential for a wide range of robotic applications. However, integrating an arm with a mobile base significantly increases the system's complexity, making precise end-effector control challenging. Existing model-based approaches are often constrained by their modeling assumptions, leading to limited robustness, while recent Reinforcement Learning (RL) implementations restrict the arm's workspace to the front of the robot or track only the end-effector position in order to achieve acceptable tracking accuracy. In this work, we address these limitations by introducing a whole-body RL formulation for end-effector pose tracking in a large workspace on rough, unstructured terrain. Our method combines a terrain-aware sampling strategy for the robot's initial configurations and end-effector pose commands with a game-based curriculum that progressively extends the robot's operating range. We validate our approach on the ANYmal quadrupedal robot equipped with a six-DoF robotic arm. Our experiments show that the learned controller achieves precise command tracking over a large workspace and adapts across varying terrains such as stairs and slopes. On deployment, it achieves a pose-tracking error of 2.64 cm and 3.64 degrees, outperforming competitive baselines.
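The game-based curriculum mentioned above can be illustrated with a minimal sketch: the sampled terrain difficulty and commanded workspace grow when the policy succeeds and shrink when it fails. All class names, thresholds, and level counts below are illustrative assumptions, not taken from the paper:

```python
# Hypothetical game-based curriculum sketch. The difficulty level is
# promoted when the recent success rate is high and demoted when it is
# low; commands are then sampled from a workspace scaled by that level.
class GameCurriculum:
    def __init__(self, num_levels=10, promote=0.8, demote=0.5):
        self.num_levels = num_levels
        self.promote = promote  # success rate needed to advance a level
        self.demote = demote    # success rate below which we back off
        self.level = 0          # start on the easiest terrain

    def update(self, success_rate):
        """Adjust the current level from the last batch of episodes."""
        if success_rate >= self.promote and self.level < self.num_levels - 1:
            self.level += 1
        elif success_rate < self.demote and self.level > 0:
            self.level -= 1
        return self.level

    def sample_command(self):
        """Scale the commanded end-effector reach with the level."""
        frac = (self.level + 1) / self.num_levels
        return {"reach_m": 0.8 * frac,       # illustrative reach bound
                "terrain_level": self.level}
```

In this scheme the "game" is the feedback loop between the policy and the environment generator: the environment only gets harder once the policy has mastered the current level, which is one common way to jointly grow locomotion robustness and tracking range.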