Learning Reactive Dexterous Grasping via Hierarchical Task-Space RL Planning and Joint-Space QP Control

📅 2026-05-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses dexterous grasping of unknown objects in unstructured environments under dynamic disturbances by proposing a decoupled hierarchical control framework. At the high level, multi-agent reinforcement learning generates task-space velocity commands. At the low level, a GPU-accelerated, parallelized quadratic programming (QP) controller maps these commands to joint velocities that strictly satisfy physical and safety constraints. This architecture accelerates training, enforces hard safety constraints on hardware, and supports zero-shot generalization: obstacle-avoidance margins can be adjusted dynamically for novel obstacles without retraining. Experiments on a 7-DoF robotic arm and a 20-DoF anthropomorphic hand demonstrate robust, real-time dexterous grasping of unseen objects and successful execution of complex manipulation tasks in real-world scenarios.
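The low-level mapping described above can be illustrated with a minimal sketch. It assumes a damped least-squares objective, min ||J q̇ − v||² + λ||q̇||², with joint-velocity box limits only, and solves it by projected gradient descent in plain Python. The function name `solve_velocity_qp`, the damping value, and the toy Jacobian are hypothetical. The paper's GPU-parallelized solver, its exact cost, and its collision constraints are not reproduced here.

```python
def solve_velocity_qp(J, v_des, qd_min, qd_max, damping=1e-3, iters=500):
    """Minimize ||J qd - v_des||^2 + damping * ||qd||^2
    subject to qd_min <= qd <= qd_max, using projected gradient descent.

    J is an m x n Jacobian (list of rows); v_des is the task-space
    velocity command that a high-level policy would supply (assumption).
    """
    m, n = len(J), len(J[0])
    # The Hessian is 2 * (J^T J + damping * I); its largest eigenvalue is
    # bounded by 2 * (||J||_F^2 + damping), which gives a safe step size.
    frob_sq = sum(J[i][j] ** 2 for i in range(m) for j in range(n))
    step = 1.0 / (2.0 * (frob_sq + damping))

    # Start from zero velocity, projected into the feasible box.
    qd = [min(max(0.0, qd_min[j]), qd_max[j]) for j in range(n)]
    for _ in range(iters):
        # Task-space residual r = J qd - v_des.
        r = [sum(J[i][j] * qd[j] for j in range(n)) - v_des[i] for i in range(m)]
        # Gradient g = 2 J^T r + 2 * damping * qd.
        g = [2.0 * sum(J[i][j] * r[i] for i in range(m)) + 2.0 * damping * qd[j]
             for j in range(n)]
        # Gradient step followed by projection onto the velocity limits.
        qd = [min(max(qd[j] - step * g[j], qd_min[j]), qd_max[j]) for j in range(n)]
    return qd


# Toy usage: a 2-joint system with an identity Jacobian (hypothetical values).
J_toy = [[1.0, 0.0], [0.0, 1.0]]
qd_toy = solve_velocity_qp(J_toy, [2.0, 0.1], [-0.5, -0.5], [0.5, 0.5])
```

In this toy case the first commanded velocity exceeds the limit, so the first joint saturates at 0.5 while the second tracks its command closely. The learned policy's output is filtered rather than trusted, which is the sense in which the limits are hard constraints.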

📝 Abstract

In this work, we propose a hybrid hierarchical control framework for reactive dexterous grasping that explicitly decouples high-level spatial intent from low-level joint execution. We introduce a multi-agent reinforcement learning architecture, specialized into distinct arm and hand agents, that acts as a high-level planner by generating desired task-space velocity commands. These commands are then processed by a GPU-parallelized quadratic programming controller, which translates them into feasible joint velocities while strictly enforcing kinematic limits and collision avoidance. This structural isolation not only accelerates training convergence but also strictly enforces hardware safety. Furthermore, the architecture unlocks zero-shot steerability, allowing system operators to dynamically adjust safety margins and avoid dynamic obstacles without retraining the policy. We extensively validate the proposed framework through a rigorous simulation-to-reality pipeline. Real-world hardware experiments on a 7-DoF arm equipped with a 20-DoF anthropomorphic hand demonstrate highly robust zero-shot transferability for dexterous grasping to a diverse set of unseen objects, highlighting the system's ability to reactively recover from unexpected physical disturbances in unstructured environments.
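One way to picture the adjustable safety margin is a velocity-damper style constraint, sketched below in task space for a single obstacle. This is an assumption for illustration, not the paper's stated formulation: the function names, the `influence` and `gain` parameters, and the single half-space constraint are all hypothetical. The filter is the closed-form solution of a small QP, min ||v − v_des||² subject to one linear bound on the speed toward the obstacle.

```python
def approach_speed_limit(distance, margin, influence=0.3, gain=1.0):
    """Upper bound on the speed toward an obstacle (velocity-damper form).

    The bound shrinks linearly to zero as the distance approaches the
    safety margin, and is inactive beyond the influence distance.
    """
    if influence <= margin:
        raise ValueError("influence distance must exceed the safety margin")
    if distance >= influence:
        return float("inf")
    return gain * (distance - margin) / (influence - margin)


def filter_command(v_des, normal, distance, margin, influence=0.3, gain=1.0):
    """Project a commanded velocity onto the half-space allowed by the damper.

    `normal` is a unit vector pointing from the obstacle toward the robot,
    so the approach speed is the negative of its dot product with v_des.
    """
    limit = approach_speed_limit(distance, margin, influence, gain)
    approach = -sum(n * v for n, v in zip(normal, v_des))
    if approach <= limit:
        return list(v_des)  # constraint inactive: pass the command through
    excess = approach - limit
    # Remove only the excess approach speed; tangential motion is untouched.
    return [v + excess * n for v, n in zip(v_des, normal)]


# Same command, two margins (hypothetical numbers): only the margin changes.
command = [-0.5, 0.2]
loose = filter_command(command, [1.0, 0.0], distance=0.10, margin=0.05)
tight = filter_command(command, [1.0, 0.0], distance=0.10, margin=0.10)
```

With the larger margin, the component of the command toward the obstacle is removed entirely, while the smaller margin still permits a slow approach. The command itself is identical in both calls, which is why a change of margin needs no retraining of the policy that produced it.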

Problem

Research questions and friction points this paper is trying to address.

reactive grasping

dexterous manipulation

zero-shot transfer

collision avoidance

hardware safety

Innovation

Methods, ideas, or system contributions that make the work stand out.

hierarchical reinforcement learning

quadratic programming control

dexterous grasping