Concurrent Prehensile and Nonprehensile Manipulation: A Practical Approach to Multi-Stage Dexterous Tasks

📅 2026-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses long-horizon, multi-stage manipulation with dexterous hands, where the robot must combine grasping and non-grasping actions, often concurrently, under complex contact dynamics and with scarce demonstration data. The authors propose DexMulti, a modular framework that decomposes demonstrations into object-centric skill units with well-defined temporal boundaries. By combining geometry-driven skill retrieval, uncertainty-aware estimation of object centroid and yaw angle, and modular execution, DexMulti substantially reduces reliance on demonstrations. Using only 3–4 demonstrations per object, the method achieves an average success rate of 66% on training objects in an evaluation of more than 1,000 real-world trials, outperforming diffusion-policy baselines by 2–3×. It also generalizes robustly to unseen objects and to spatial displacements of up to ±25 cm.

📝 Abstract
Dexterous hands enable concurrent prehensile and nonprehensile manipulation, such as holding one object while interacting with another, a capability essential for everyday tasks yet underexplored in robotics. Learning such long-horizon, contact-rich multi-stage behaviors is challenging because demonstrations are expensive to collect and end-to-end policies require substantial data to generalize across varied object geometries and placements. We present DexMulti, a sample-efficient approach for real-world dexterous multi-task manipulation that decomposes demonstrations into object-centric skills with well-defined temporal boundaries. Rather than learning monolithic policies, our method retrieves demonstrated skills based on current object geometry, aligns them to the observed object state using an uncertainty-aware estimator that tracks centroid and yaw, and executes them via a retrieve-align-execute paradigm. We evaluate on three multi-stage tasks requiring concurrent manipulation (Grasp + Pull, Grasp + Open, and Grasp + Grasp) across two dexterous hands (Allegro and LEAP) in over 1,000 real-world trials. Our approach achieves an average success rate of 66% on training objects with only 3-4 demonstrations per object, outperforming diffusion policy baselines by 2-3x while requiring far fewer demonstrations. Results demonstrate robust generalization to held-out objects and spatial variations up to +/-25 cm.
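The retrieve-and-align steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a planar setting in which an object's state is just a centroid and a yaw angle, a hypothetical fixed-length geometry descriptor per demonstrated skill, and nearest-neighbor retrieval by Euclidean distance. The uncertainty-aware part of the estimator is omitted, and the names `retrieve_skill`, `align_waypoints`, and the dictionary layout are illustrative only.

```python
import math


def retrieve_skill(library, descriptor):
    """Return the stored skill whose geometry descriptor is closest.

    `library` is a list of dicts with hypothetical keys "descriptor",
    "demo_pose", and "waypoints"; distance is plain Euclidean.
    """
    return min(library, key=lambda s: math.dist(s["descriptor"], descriptor))


def align_waypoints(waypoints, demo_pose, observed_pose):
    """Map demonstrated (x, y) waypoints onto the observed object state.

    Poses are (cx, cy, yaw) with yaw in radians. The mapping is a planar
    rigid transform: rotate about the demo centroid by the yaw difference,
    then translate to the observed centroid.
    """
    dcx, dcy, dyaw = demo_pose
    ocx, ocy, oyaw = observed_pose
    dtheta = oyaw - dyaw
    c, s = math.cos(dtheta), math.sin(dtheta)
    aligned = []
    for x, y in waypoints:
        rx, ry = x - dcx, y - dcy  # waypoint relative to the demo centroid
        aligned.append((ocx + c * rx - s * ry, ocy + s * rx + c * ry))
    return aligned


# Illustrative data only: two stored skills with made-up descriptors.
library = [
    {"name": "box", "descriptor": (0.10, 0.10, 0.20),
     "demo_pose": (0.0, 0.0, 0.0), "waypoints": [(1.0, 0.0)]},
    {"name": "bottle", "descriptor": (0.05, 0.05, 0.25),
     "demo_pose": (0.0, 0.0, 0.0), "waypoints": [(0.5, 0.0)]},
]

skill = retrieve_skill(library, (0.11, 0.09, 0.21))
aligned = align_waypoints(skill["waypoints"], skill["demo_pose"],
                          (1.0, 1.0, math.pi / 2))
```

In this sketch the aligned waypoints would then be handed to a controller for execution. A real system would work with full 6-DoF hand poses and finger trajectories rather than 2D points.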
Problem

Research questions and friction points this paper is trying to address.

dexterous manipulation
concurrent prehensile and nonprehensile manipulation
multi-stage tasks
sample-efficient learning
contact-rich manipulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

dexterous manipulation
skill decomposition
retrieve-align-execute
sample-efficient learning
uncertainty-aware alignment
Hao Jiang
Yue Wu
Yue Wang
Gaurav S. Sukhatme
Daniel Seita
University of Southern California
Robotics, Machine Learning