Multi-critic Learning for Whole-body End-effector Twist Tracking

📅 2025-07-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Quadrupedal robots with an arm face conflicting objectives between locomotion and manipulation. Locomotion typically favors a stable, horizontal base, whereas end-effector trajectory tracking often benefits from base tilting to expand the reachable workspace. Moreover, existing reinforcement learning (RL) methods that rely on pose-based task specifications struggle to achieve smooth velocity tracking. Method: We propose a multi-critic actor-critic framework that decouples the locomotion and manipulation reward signals, and we introduce twist-based end-effector velocity tracking, so that whole-body coordination emerges without being explicitly formulated. Contribution/Results: The method supports both discrete pose tracking and continuous trajectory tracking, accommodates dynamic gaits and real-time manipulation, and is validated in simulation and on a physical quadrupedal manipulator. It tracks end-effector velocity during locomotion and shows adaptive base tilting that extends the operational workspace through coordinated whole-body motion.

📝 Abstract

Learning whole-body control for locomotion and arm motions in a single policy has challenges, as the two tasks have conflicting goals. For instance, efficient locomotion typically favors a horizontal base orientation, while end-effector tracking may benefit from base tilting to extend reachability. Additionally, current Reinforcement Learning (RL) approaches using a pose-based task specification lack the ability to directly control the end-effector velocity, making smoothly executing trajectories very challenging. To address these limitations, we propose an RL-based framework that allows for dynamic, velocity-aware whole-body end-effector control. Our method introduces a multi-critic actor architecture that decouples the reward signals for locomotion and manipulation, simplifying reward tuning and allowing the policy to resolve task conflicts more effectively. Furthermore, we design a twist-based end-effector task formulation that can track both discrete poses and motion trajectories. We validate our approach through a set of simulation and hardware experiments using a quadruped robot equipped with a robotic arm. The resulting controller can simultaneously walk and move its end-effector and shows emergent whole-body behaviors, where the base assists the arm in extending the workspace, despite a lack of explicit formulations.
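
As a rough illustration of the multi-critic idea in the abstract, the sketch below keeps one advantage estimate per reward group (locomotion and manipulation), normalizes each group separately, and sums them into a single advantage for the actor. The abstract does not describe the actual combination rule, so the per-group normalization, the weights, and the function names `gae` and `combine_advantages` are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation for one reward stream.

    Assumes a single rollout with no episode termination inside it.
    """
    steps = len(rewards)
    adv = np.zeros(steps)
    next_value, running = last_value, 0.0
    for t in reversed(range(steps)):
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        adv[t] = running
        next_value = values[t]
    return adv

def combine_advantages(adv_by_group, weights):
    """Normalize each group's advantage, then form a weighted sum for the actor.

    Hypothetical combination rule: normalizing per group keeps a reward group
    with a large scale from dominating the policy gradient.
    """
    total = np.zeros_like(next(iter(adv_by_group.values())), dtype=float)
    for name, adv in adv_by_group.items():
        total += weights[name] * (adv - adv.mean()) / (adv.std() + 1e-8)
    return total

# Usage with synthetic rewards; each critic would normally supply `values`.
rng = np.random.default_rng(0)
steps = 8
rewards = {
    "locomotion": rng.normal(size=steps),
    "manipulation": 10.0 * rng.normal(size=steps),  # deliberately different scale
}
values = {name: np.zeros(steps) for name in rewards}
adv = {name: gae(rewards[name], values[name], 0.0) for name in rewards}
actor_adv = combine_advantages(adv, {"locomotion": 1.0, "manipulation": 1.0})
```

Because each group is normalized before summing, rescaling one group's rewards leaves the actor's advantage essentially unchanged, which is one way separate critics can simplify reward tuning.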

Problem

Research questions and friction points this paper is trying to address.

Conflicting goals in locomotion and arm motion control

Lack of direct end-effector velocity control in RL

Difficulty in tracking both poses and motion trajectories

Innovation

Methods, ideas, or system contributions that make the work stand out.

RL-based framework for dynamic velocity-aware control

Multi-critic actor architecture decouples reward signals

Twist-based end-effector task formulation tracks both discrete poses and motion trajectories
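
To make the twist idea concrete: a twist stacks a linear and an angular velocity into one 6-D command. The sketch below shows one plausible way a pose target can be turned into such a command, by asking for the constant world-frame twist that would reach the target in a chosen time. The summary does not give the paper's formulation, so the `horizon` parameter and the helpers `rotation_log`, `twist_to_target`, and `rot_z` are hypothetical names introduced only for this illustration.

```python
import numpy as np

def rotation_log(R):
    """Rotation vector (axis * angle) of a rotation matrix.

    Valid for rotation angles below pi; the axis is ill-defined at exactly pi.
    """
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    angle = np.arccos(cos_angle)
    if angle < 1e-8:
        return np.zeros(3)
    axis = np.array([
        R[2, 1] - R[1, 2],
        R[0, 2] - R[2, 0],
        R[1, 0] - R[0, 1],
    ]) / (2.0 * np.sin(angle))
    return axis * angle

def twist_to_target(p, R, p_target, R_target, horizon):
    """World-frame twist [v, w] that reaches the target pose in `horizon` seconds."""
    v = (p_target - p) / horizon
    w = rotation_log(R_target @ R.T) / horizon
    return np.concatenate([v, w])

def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Usage: move 0.2 m along x and yaw by 0.5 rad within 2 s.
twist = twist_to_target(
    p=np.zeros(3), R=np.eye(3),
    p_target=np.array([0.2, 0.0, 0.0]), R_target=rot_z(0.5),
    horizon=2.0,
)
```

Under this assumption, a discrete pose target and a trajectory waypoint are handled the same way: both reduce to a velocity command, and a trajectory simply updates the target pose and horizon over time.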

🔎 Similar Papers

Omnigrasp: Grasping Diverse Objects with Simulated Humanoids