Canonical Representation and Force-Based Pretraining of 3D Tactile for Dexterous Visuo-Tactile Policy Learning

📅 2024-09-26

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

career value

195K/year

🤖 AI Summary

Large-scale labeled data and effective pretraining paradigms are lacking for 3D tactile feature learning. Method: This paper proposes a force-driven self-supervised pretraining framework. Contribution/Results: (1) It introduces the first standardized 3D tactile representation, unifying spatial and force information via normalized coordinate mapping and force-field encoding; (2) it designs a local–global force-aware contrastive self-supervised task to jointly learn spatial–force dual-channel representations; (3) it supports vision–tactile multimodal policy fine-tuning. Evaluated on four real-world fine manipulation tasks, the framework achieves an average success rate of 78%, significantly outperforming prior methods. This work is the first to empirically validate that force-perception-guided tactile representation enhances both effectiveness and generalization in contact-intensive dexterous manipulation.

Technology Category

Application Category

📝 Abstract

Tactile sensing plays a vital role in enabling robots to perform fine-grained, contact-rich tasks. However, the high dimensionality of tactile data, due to the large coverage on dexterous hands, poses significant challenges for effective tactile feature learning, especially for 3D tactile data, as there are no large standardized datasets and no strong pretrained backbones. To address these challenges, we propose a novel canonical representation that reduces the difficulty of 3D tactile feature learning and further introduces a force-based self-supervised pretraining task to capture both local and net force features, which are crucial for dexterous manipulation. Our method achieves an average success rate of 78% across four fine-grained, contact-rich dexterous manipulation tasks in real-world experiments, demonstrating effectiveness and robustness compared to other methods. Further analysis shows that our method fully utilizes both spatial and force information from 3D tactile data to accomplish the tasks. The videos can be viewed at https://3dtacdex.github.io.

Problem

Research questions and friction points this paper is trying to address.

Reduces 3D tactile feature learning difficulty

Introduces force-based pretraining for manipulation

Improves success in dexterous manipulation tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Canonical representation simplifies 3D tactile learning

Force-based pretraining captures essential force features

Method utilizes spatial and force information effectively

🔎 Similar Papers

Low Fidelity Visuo-Tactile Pretraining Improves Vision-Only Manipulation Performance