Grasp2Grasp: Vision-Based Dexterous Grasp Translation via Schrödinger Bridges

📅 2025-06-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses zero-shot, vision-driven grasp intention transfer across morphologically distinct robotic hands: given only images of grasps executed by a source hand, it synthesizes physically feasible, functionally equivalent grasp poses for a target hand—without paired data or simulation. Methodologically, it pioneers the application of Schrödinger Bridge theory to grasp transfer, formulating a physics-aware multimodal cost function encompassing base pose, contact maps, wrench space, and manipulability. The framework jointly optimizes latent-space mapping via hybrid score- and flow-matching, vision-conditioned generative modeling, stochastic optimal transport, and multiple physical constraints. Experiments demonstrate that the method generates stable, high-success-rate grasps across diverse hand-object combinations, significantly outperforming supervised and reinforcement learning baselines. It achieves true semantic-level, zero-shot generalization across hand morphologies.

📝 Abstract
We propose a new approach to vision-based dexterous grasp translation, which aims to transfer grasp intent across robotic hands with differing morphologies. Given a visual observation of a source hand grasping an object, our goal is to synthesize a functionally equivalent grasp for a target hand without requiring paired demonstrations or hand-specific simulations. We frame this problem as a stochastic transport between grasp distributions using the Schrödinger Bridge formalism. Our method learns to map between source and target latent grasp spaces via score and flow matching, conditioned on visual observations. To guide this translation, we introduce physics-informed cost functions that encode alignment in base pose, contact maps, wrench space, and manipulability. Experiments across diverse hand-object pairs demonstrate our approach generates stable, physically grounded grasps with strong generalization. This work enables semantic grasp transfer for heterogeneous manipulators and bridges vision-based grasping with probabilistic generative modeling.
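The Schrödinger Bridge translation described in the abstract can be pictured as a pinned stochastic interpolation between source and target latent grasp codes, with a flow-matching velocity target along the bridge. The sketch below is a toy illustration only: the latent dimension, `sigma`, and all function names are assumptions for exposition, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def bridge_sample(x0, x1, t, sigma=0.5, rng=rng):
    """Sample a Brownian bridge between latent codes x0 and x1 at time t.

    The mean interpolates linearly; the variance sigma^2 * t * (1 - t)
    vanishes at both endpoints, so samples are pinned to x0 at t=0 and
    to x1 at t=1.
    """
    mean = (1.0 - t) * x0 + t * x1
    std = sigma * np.sqrt(t * (1.0 - t))
    return mean + std * rng.standard_normal(x0.shape)

def flow_target(x0, x1):
    """Conditional flow-matching target: time derivative of the bridge mean."""
    return x1 - x0

# Toy latent grasp codes for a source hand and a target hand (dim is arbitrary).
x_src = rng.standard_normal(8)
x_tgt = rng.standard_normal(8)

x_mid = bridge_sample(x_src, x_tgt, t=0.5)   # noisy midpoint along the bridge
v = flow_target(x_src, x_tgt)                # velocity a network would regress

# Endpoints are pinned: the noise term vanishes at t=0 and t=1.
assert np.allclose(bridge_sample(x_src, x_tgt, 0.0), x_src)
assert np.allclose(bridge_sample(x_src, x_tgt, 1.0), x_tgt)
```

In the paper, a vision-conditioned network would regress the score/velocity along such bridges; here the sample and target are computed in closed form to show the structure of the training pair.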
Problem

Research questions and friction points this paper is trying to address.

Transfer grasp intent across different robotic hands
Synthesize functionally equivalent grasps without paired demonstrations
Bridge vision-based grasping with probabilistic generative modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Schrödinger Bridge for grasp translation
Score and flow matching in latent spaces
Physics-informed cost functions guide alignment
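The physics-informed guidance listed above can be thought of as a weighted sum of alignment costs over four modalities. The terms, weights, and array shapes below are illustrative placeholders, not the paper's actual cost function.

```python
import numpy as np

def grasp_transfer_cost(src, tgt, w=(1.0, 1.0, 1.0, 1.0)):
    """Toy multimodal transfer cost: base pose, contact map, wrench, manipulability.

    `src` and `tgt` are dicts of arrays/scalars describing a grasp's physical
    signature; each term penalizes misalignment between the source grasp and
    a candidate target-hand grasp. Weights `w` trade off the four modalities.
    """
    c_pose = np.linalg.norm(src["base_pose"] - tgt["base_pose"])
    c_contact = np.mean((src["contact_map"] - tgt["contact_map"]) ** 2)
    c_wrench = np.linalg.norm(src["wrench"] - tgt["wrench"])
    # Penalize only a *loss* of manipulability relative to the source grasp.
    c_manip = max(0.0, src["manip"] - tgt["manip"])
    return w[0] * c_pose + w[1] * c_contact + w[2] * c_wrench + w[3] * c_manip

src = {"base_pose": np.zeros(6), "contact_map": np.full(16, 0.5),
       "wrench": np.zeros(6), "manip": 0.8}
tgt_good = dict(src, manip=0.9)            # aligned and more manipulable
tgt_bad = dict(src, base_pose=np.ones(6))  # misaligned base pose

assert grasp_transfer_cost(src, tgt_good) == 0.0
assert grasp_transfer_cost(src, tgt_bad) > 0.0
```

In the paper this kind of cost steers the stochastic transport, so the translated grasp matches the source grasp's function (where and how it holds the object), not just its appearance.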
Tao Zhong
Princeton University
Jonah Buchanan
San Jose State University, Lockheed Martin Corporation
Christine Allen-Blanchette
Assistant Professor, Princeton University
Computer vision