AnyTouch 2: General Optical Tactile Representation Learning For Dynamic Tactile Perception

๐Ÿ“… 2026-02-10
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This work addresses the limitation of existing tactile learning methods, which predominantly focus on object-level static properties and struggle to model fine-grained force and deformation dynamics during physical interactions. To bridge this gap, the authors introduce ToucHD, a large-scale hierarchical tactile dataset that, for the first time, encompasses atomic actions, real-world manipulations, and aligned forceโ€“tactile pairs. They further propose the AnyTouch 2 framework, which enables universal dynamic tactile representation learning across sensors by jointly modeling multi-frame deformations and explicit force dynamics. This approach unifies object-level understanding with fine-grained force perception, achieving strong performance in static attribute recognition, dynamic physical property estimation, and multi-level manipulation tasks. The method significantly enhances model generalization and robustness across diverse optical tactile sensors and complex interaction scenarios.

๐Ÿ“ Abstract
Real-world contact-rich manipulation requires robots to perceive temporal tactile feedback, capture subtle surface deformations, and reason about object properties as well as force dynamics. Although optical tactile sensors are uniquely capable of providing such rich information, existing tactile datasets and models remain limited. These resources primarily focus on object-level attributes (e.g., material) while largely overlooking fine-grained tactile temporal dynamics during physical interactions. We argue that advancing dynamic tactile perception requires a systematic hierarchy of dynamic perception capabilities to guide both data collection and model design. To address the lack of tactile data with rich dynamic information, we present ToucHD, a large-scale hierarchical tactile dataset spanning tactile atomic actions, real-world manipulations, and touch-force paired data. Beyond scale, ToucHD establishes a comprehensive tactile dynamic data ecosystem that explicitly supports hierarchical perception capabilities from the data perspective. Building on it, we propose AnyTouch 2, a general tactile representation learning framework for diverse optical tactile sensors that unifies object-level understanding with fine-grained, force-aware dynamic perception. The framework captures both pixel-level and action-specific deformations across frames, while explicitly modeling physical force dynamics, thereby learning multi-level dynamic perception capabilities from the model perspective. We evaluate our model on benchmarks that cover static object properties and dynamic physical attributes, as well as real-world manipulation tasks spanning multiple tiers of dynamic perception capabilities, from basic object-level understanding to force-aware dexterous manipulation. Experimental results demonstrate consistent and strong performance across sensors and tasks.
Problem

Research questions and friction points this paper is trying to address.

dynamic tactile perception
optical tactile sensors
tactile temporal dynamics
force dynamics
hierarchical perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

dynamic tactile perception
hierarchical tactile dataset
optical tactile sensors
force-aware representation learning
tactile representation learning
๐Ÿ”Ž Similar Papers
No similar papers found.
Ruoxuan Feng
Renmin University of China
Embodied AI, Multi-modal Learning
Yuxuan Zhou
Beijing Jiaotong University
Siyu Mei
Beijing Jiaotong University
Dongzhan Zhou
Researcher at Shanghai AI Lab
AI4Science, computer vision, deep learning
Pengwei Wang
University of Calgary
Computer Science, Security
Shaowei Cui
Institute of Automation, Chinese Academy of Sciences; Beijing Academy of Artificial Intelligence
Bin Fang
Beijing University of Posts and Telecommunications / Tsinghua University
Robotics and AI
Guocai Yao
Beijing Academy of Artificial Intelligence; State Key Laboratory of Multimedia Information Processing, Peking University
Di Hu
Tenure-track Associate Professor, Renmin University of China
Multimodal Perception, Multimodal Learning, Multimodal Interaction