Spatially Anchored Tactile Awareness for Robust Dexterous Manipulation

📅 2025-10-16
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing vision–tactile learning methods struggle to achieve sub-millimeter dexterous manipulation, primarily because tactile representations are not explicitly aligned with the hand's kinematic coordinate system and thus fail to exploit the rich spatial information inherent in tactile signals. To address this, we propose Spatially Anchored Tactile Representation (SaTA), the first framework to explicitly anchor tactile measurements to the hand's coordinate frame, yielding a geometrically interpretable and sensor-faithful tactile representation that enables not only contact detection but also precise local geometric reconstruction of objects. SaTA integrates forward-kinematics-driven tactile feature alignment, end-to-end policy learning, and a multimodal tactile–visual fusion network. Evaluated on USB-C insertion/removal, lightbulb installation, and card sliding tasks, SaTA improves success rates by up to 30 percentage points and reduces task completion time by 27 percent, significantly advancing model-free learning-based approaches for high-precision dexterous manipulation.

📝 Abstract
Dexterous manipulation requires precise geometric reasoning, yet existing visuo-tactile learning methods struggle with sub-millimeter precision tasks that are routine for traditional model-based approaches. We identify a key limitation: while tactile sensors provide rich contact information, current learning frameworks fail to effectively leverage both the perceptual richness of tactile signals and their spatial relationship with hand kinematics. We believe an ideal tactile representation should explicitly ground contact measurements in a stable reference frame while preserving detailed sensory information, enabling policies to not only detect contact occurrence but also precisely infer object geometry in the hand's coordinate system. We introduce SaTA (Spatially-anchored Tactile Awareness for dexterous manipulation), an end-to-end policy framework that explicitly anchors tactile features to the hand's kinematic frame through forward kinematics, enabling accurate geometric reasoning without requiring object models or explicit pose estimation. Our key insight is that spatially grounded tactile representations allow policies to not only detect contact occurrence but also precisely infer object geometry in the hand's coordinate system. We validate SaTA on challenging dexterous manipulation tasks, including bimanual USB-C mating in free space, a task demanding sub-millimeter alignment precision, as well as light bulb installation requiring precise thread engagement and rotational control, and card sliding that demands delicate force modulation and angular precision. These tasks represent significant challenges for learning-based methods due to their stringent precision requirements. Across multiple benchmarks, SaTA significantly outperforms strong visuo-tactile baselines, improving success rates by up to 30 percentage points while reducing task completion times by 27 percent.
Problem

Research questions and friction points this paper is trying to address.

Achieving sub-millimeter precision in dexterous manipulation tasks
Leveraging the spatial relationship between tactile signals and hand kinematics
Enabling geometric reasoning without object models or pose estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatially anchors tactile features to hand kinematics
Enables geometric reasoning without object models
Uses forward kinematics for stable tactile representation
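The core idea of the bullets above, re-expressing tactile readings in the hand's kinematic frame via forward kinematics, can be illustrated with a minimal sketch. This is not the authors' implementation: the planar single-finger chain, the function names, and the taxel layout are all hypothetical, chosen only to show how FK-derived sensor poses anchor taxel contact points in a stable hand base frame.

```python
import numpy as np

def rot_z(theta):
    """Homogeneous rotation about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T

def trans_x(length):
    """Homogeneous translation along the x-axis."""
    T = np.eye(4)
    T[0, 3] = length
    return T

def fingertip_pose(joint_angles, link_lengths):
    """Forward kinematics of a toy planar finger: returns the 4x4 pose
    of the fingertip sensor frame expressed in the hand base frame."""
    T = np.eye(4)
    for theta, length in zip(joint_angles, link_lengths):
        T = T @ rot_z(theta) @ trans_x(length)
    return T

def anchor_taxels(taxels_sensor, T_base_sensor):
    """Re-express taxel contact points (N x 3, sensor frame) in the hand
    base frame, yielding a spatially anchored tactile point set."""
    pts = np.hstack([taxels_sensor, np.ones((len(taxels_sensor), 1))])
    return (T_base_sensor @ pts.T).T[:, :3]

# Straight finger with two unit links: the sensor origin lands at (2, 0, 0),
# so a taxel 1 cm along the sensor's x-axis anchors at (2.01, 0, 0).
T = fingertip_pose([0.0, 0.0], [1.0, 1.0])
anchored = anchor_taxels(np.array([[0.0, 0.0, 0.0], [0.01, 0.0, 0.0]]), T)
```

Because the anchored points live in the hand frame regardless of finger posture, a downstream policy can reason about local object geometry without an object model or explicit pose estimation, which is the property the Innovation bullets describe.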
Authors
Jialei Huang (PhD student, Tsinghua University; Reinforcement Learning, Computer Vision, 3D)
Yang Ye (Tsinghua University)
Yuanqing Gong (Tsinghua University)
Xuezhou Zhu (Tsinghua University)
Yang Gao (Sharpa)
Kaifeng Zhang (Columbia University; Robotics, Physics Simulation, Machine Learning, Computer Vision)