🤖 AI Summary
Dexterous manipulation of articulated tools (e.g., tweezers, scissors) by anthropomorphic robotic hands is challenging due to dynamic tool configuration changes, which complicate perception and control.
Method: We propose a hierarchical goal-conditioned reinforcement learning framework. A low-level policy executes fine-grained motion control of the hand and tool, while a high-level policy, guided by a tool-affordance state encoder, perceives the tool's configuration in real time and adapts to it. A replay buffer populated by a heuristic policy with access to privileged information accelerates training of the high-level policy. The method integrates point-cloud encoding, synthetic-data pretraining, and goal-conditioned policies.
Contribution/Results: This design lets the robot grasp objects of diverse shapes and sizes. In real-world experiments on a physical robot, the system achieves a 70.8% success rate when manipulating a tweezer-like tool, indicating that the approach is practical for dexterous handling of articulated tools.
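As a rough illustration of the replay-buffer idea in the summary, the sketch below pre-fills a buffer with transitions produced by a heuristic that sees privileged information (the true object width), while the stored observation is only a noisy estimate. Every name here (`Transition`, `heuristic_action`, `toy_reward`, `seed_replay_buffer`), the one-step episode, the reward, and the numeric ranges are hypothetical stand-ins, not the authors' implementation.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Transition:
    obs: tuple      # what a learned policy would observe (noisy width estimate)
    action: float   # commanded tool opening, i.e. a goal for the low-level policy
    reward: float
    done: bool


def heuristic_action(true_object_width, margin=0.005):
    """Privileged heuristic (assumption): open slightly wider than the true width."""
    return true_object_width + margin


def toy_reward(opening, true_object_width, tolerance=0.01):
    """Hypothetical sparse reward: 1 if the opening clears the object without excess."""
    gap = opening - true_object_width
    return 1.0 if 0.0 <= gap <= tolerance else 0.0


def seed_replay_buffer(num_episodes, capacity=10_000, seed=0):
    """Fill a bounded buffer with one-step heuristic episodes."""
    rng = random.Random(seed)
    buffer = deque(maxlen=capacity)
    for _ in range(num_episodes):
        width = rng.uniform(0.01, 0.05)          # privileged: true object width
        noisy = width + rng.gauss(0.0, 0.002)    # non-privileged observation
        action = heuristic_action(width)
        buffer.append(Transition((noisy,), action, toy_reward(action, width), True))
    return buffer
```

An off-policy learner could then sample from this buffer alongside its own experience, so that early updates already see some successful transitions.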
📝 Abstract
Manipulating articulated tools, such as tweezers or scissors, has rarely been explored in previous research. Unlike rigid tools, articulated tools change their shape dynamically, creating unique challenges for dexterous robotic hands. In this work, we present a hierarchical, goal-conditioned reinforcement learning (GCRL) framework to improve the manipulation capabilities of anthropomorphic robotic hands using articulated tools. Our framework comprises two policy layers: (1) a low-level policy that enables the dexterous hand to manipulate the tool into various configurations for objects of different sizes, and (2) a high-level policy that defines the tool's goal state and controls the robotic arm for object-picking tasks. We employ an encoder, trained on synthetic point clouds, to estimate the tool's affordance states from input point clouds, that is, how different tool configurations (e.g., tweezer opening angles) enable grasping of objects of varying sizes, thereby enabling precise tool manipulation. We also use a privilege-informed heuristic policy to populate the replay buffer, which improves the training efficiency of the high-level policy. We validate our approach through real-world experiments, showing that the robot can effectively manipulate a tweezer-like tool to grasp objects of diverse shapes and sizes with a 70.8% success rate. This study highlights the potential of RL to advance dexterous robotic manipulation of articulated tools.
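The two-layer structure described in the abstract can be sketched as a simple control loop: a high-level policy picks a goal tool state, and a goal-conditioned low-level policy drives the tool toward it. This is a minimal toy under stated assumptions: the tool state is reduced to a single opening value, both "policies" are hand-written stand-ins for the learned ones, and all names (`ToolState`, `high_level_policy`, `low_level_policy`, `run_episode`) and constants are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToolState:
    opening: float  # current tool opening (hypothetical scalar state)


def high_level_policy(object_width, clearance=0.004):
    """Stand-in for the learned high-level policy: choose a goal opening."""
    return object_width + clearance


def low_level_policy(state, goal_opening, gain=0.5):
    """Stand-in for the goal-conditioned low-level policy: proportional step."""
    return gain * (goal_opening - state.opening)


def run_episode(object_width, initial_opening=0.0, steps=30, tol=1e-3):
    """Set one goal, then let the low-level policy track it until within tol."""
    state = ToolState(initial_opening)
    goal = high_level_policy(object_width)
    for _ in range(steps):
        state.opening += low_level_policy(state, goal)
        if abs(goal - state.opening) < tol:
            break
    return goal, state.opening
```

In the framework described above, the goal would come from a learned policy using the encoder's affordance estimate, and the low-level action would be joint commands for the hand rather than a scalar increment.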