🤖 AI Summary
Tactile image generation faces two key challenges: (1) poor generalization to unseen materials and inadequate modeling of subtle contact features; and (2) inflated performance estimates due to data leakage between training and test sets in existing evaluation protocols. To address these, we propose a strict, zero-data-leakage evaluation protocol and introduce the first reference-free evaluation framework for tactile generation, comprising four distributional distance measures: TMMD, I-TMMD, CI-TMMD, and D-TMMD. Furthermore, we design a cross-modal generation paradigm that uses material text descriptions as an intermediate modality, enabling text-guided, vision-touch disentangled modeling. Evaluated on the Touch and Go and HCT datasets under leakage-isolated settings, our method achieves significant improvements in generalization performance and, for the first time, systematically validates models' capacity to capture authentic tactile characteristics.
📝 Abstract
Tactile sensing, which relies on direct physical contact, is critical for human perception and underpins applications in computer vision, robotics, and multimodal learning. Because tactile data is often scarce and costly to acquire, generating synthetic tactile images provides a scalable way to augment real-world measurements. However, ensuring robust generalization when synthesizing tactile images (capturing subtle, material-specific contact features) remains challenging. We demonstrate that overlapping training and test samples in commonly used datasets inflate performance metrics, obscuring the true generalizability of tactile models. To address this, we propose a leakage-free evaluation protocol coupled with novel, reference-free metrics (TMMD, I-TMMD, CI-TMMD, and D-TMMD) tailored for tactile generation. Moreover, we propose a vision-to-touch generation method that leverages text as an intermediate modality, incorporating concise, material-specific descriptions during training to better capture essential tactile features. Experiments on two popular visuo-tactile datasets, Touch and Go and HCT, show that our approach achieves superior performance and enhanced generalization in a leakage-free setting.
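The TMMD-family metrics are described as reference-free distributional distances. The paper's exact formulation is not reproduced here, but the standard building block for such metrics is a kernel Maximum Mean Discrepancy (MMD) estimate between feature embeddings of real and generated tactile images. A minimal sketch, assuming an RBF kernel and precomputed feature vectors (the kernel choice, bandwidth, and feature extractor are illustrative assumptions, not the paper's specification):

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Pairwise RBF kernel values between rows of X and rows of Y.
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd_squared(X, Y, sigma=1.0):
    """Unbiased squared-MMD estimate between two sample sets.

    X: (m, d) array of embeddings of real tactile images.
    Y: (n, d) array of embeddings of generated tactile images.
    """
    m, n = len(X), len(Y)
    Kxx = rbf_kernel(X, X, sigma)
    Kyy = rbf_kernel(Y, Y, sigma)
    Kxy = rbf_kernel(X, Y, sigma)
    # Drop diagonal self-similarities for the unbiased estimator.
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    term_xy = Kxy.mean()
    return term_xx + term_yy - 2.0 * term_xy
```

A lower value indicates the generated distribution is closer to the real one; no paired ground-truth reference image is needed, which is what makes an MMD-style metric reference-free.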