FlowTouch: View-Invariant Visuo-Tactile Prediction

📅 2026-03-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a core limitation of tactile sensors: they provide data only upon contact, and so cannot support planning or the initial phases of a manipulation task. To bridge this gap, the authors propose an end-to-end vision-to-tactile prediction framework that first reconstructs the 3D scene and extracts local object meshes, encoding their geometric and surface properties into a viewpoint-invariant representation. A Flow Matching generative model then predicts tactile images across varying viewpoints and different sensor instances. By decoupling the prediction from scene-dependent details, the approach substantially improves sim-to-real transferability and generalizes to novel tactile sensors and real-world environments. The predicted tactile images also support downstream tasks such as grasp stability prediction.
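To make the viewpoint-invariant encoding concrete, below is a minimal sketch of one way to crop a local mesh patch around a contact and express it in a contact-aligned canonical frame. The function name `local_patch_canonical`, the contact point `p`, the normal `n`, and the crop radius are illustrative assumptions; FlowTouch's actual mesh encoding is not specified here.

```python
# A minimal sketch of the "local mesh in a canonical frame" idea, assuming a
# hypothetical contact point `p` and surface normal `n` from the scene
# reconstruction. This crop-and-align step is illustrative only; the paper's
# encoding of geometric and surface properties is richer than this.
import numpy as np

def local_patch_canonical(vertices, p, n, radius=0.02):
    """Crop mesh vertices near contact point p and express them in a frame
    whose z-axis is the contact normal n, removing camera-view dependence."""
    n = n / np.linalg.norm(n)
    # Build an orthonormal basis (u, v, n) around the normal.
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(n, helper)
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    R = np.stack([u, v, n])              # rows: canonical axes in world coords
    local = vertices[np.linalg.norm(vertices - p, axis=1) < radius]
    return (R @ (local - p).T).T         # points in the contact-local frame
```

Because the patch is expressed relative to the contact point and normal rather than the camera, the same representation is produced regardless of the viewpoint from which the scene was observed, which is what makes view-invariant prediction possible.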

📝 Abstract
Tactile sensation is essential for contact-rich manipulation tasks. It provides direct feedback on object geometry, surface properties, and interaction forces, enhancing perception and enabling fine-grained control. An inherent limitation of tactile sensors is that readings are available only when an object is touched. This precludes their use during planning and the initial execution phase of a task. Predicting tactile information from visual information can bridge this gap. A common approach is to learn a direct mapping from camera images to the output of vision-based tactile sensors. However, the resulting model will depend strongly on the specific setup and on how well the camera can capture the area where an object is touched. In this work, we introduce FlowTouch, a novel model for view-invariant visuo-tactile prediction. Our key idea is to use an object's local 3D mesh to encode rich information for predicting tactile patterns while abstracting away from scene-dependent details. FlowTouch integrates scene reconstruction and Flow Matching-based models for image generation. Our results show that FlowTouch is able to bridge the sim-to-real gap and generalize to new sensor instances. We further show that the resulting tactile images can be used for downstream grasp stability prediction. Our code, datasets and videos are available at https://flowtouch.github.io/
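As a rough illustration of the Flow Matching component, the sketch below shows the standard conditional Flow Matching recipe (straight-line interpolant, velocity regression, Euler sampling) applied to tactile-image generation. `VelocityNet`, the conditioning embedding, and all shapes are hypothetical placeholders under that assumption, not the paper's architecture.

```python
# A minimal sketch of conditional Flow Matching for tactile-image generation.
# `VelocityNet` and the mesh-derived condition `cond` are hypothetical
# placeholders, not FlowTouch's actual model or API.
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Hypothetical network predicting the flow velocity v_theta(x_t, t, c)."""
    def __init__(self, img_channels=3, cond_dim=256):
        super().__init__()
        # Placeholder body; the paper's architecture is not specified here.
        self.net = nn.Conv2d(img_channels + 1, img_channels, 3, padding=1)
        self.cond_proj = nn.Linear(cond_dim, 1)

    def forward(self, x_t, t, cond):
        # Broadcast the scalar time and projected condition over the image.
        b, _, h, w = x_t.shape
        t_map = t.view(b, 1, 1, 1).expand(b, 1, h, w)
        bias = self.cond_proj(cond).view(b, 1, 1, 1)
        return self.net(torch.cat([x_t, t_map], dim=1)) + bias

def flow_matching_loss(model, x1, cond):
    """One training step: regress the straight-line velocity x1 - x0."""
    x0 = torch.randn_like(x1)                      # Gaussian source sample
    t = torch.rand(x1.shape[0], device=x1.device)  # uniform time in [0, 1]
    tb = t.view(-1, 1, 1, 1)
    x_t = (1 - tb) * x0 + tb * x1                  # linear interpolant
    v_target = x1 - x0
    return ((model(x_t, t, cond) - v_target) ** 2).mean()

@torch.no_grad()
def sample(model, cond, shape, steps=50):
    """Euler integration of dx/dt = v_theta from noise to a tactile image."""
    x = torch.randn(shape, device=cond.device)
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((shape[0],), i * dt, device=cond.device)
        x = x + dt * model(x, t, cond)
    return x
```

The design follows generic Flow Matching practice: train the velocity field on random (t, x_t) pairs, then integrate it from Gaussian noise to an image, conditioned on the mesh-derived embedding so the same model serves different viewpoints and sensor instances.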
Problem

Research questions and friction points this paper is trying to address.

visuo-tactile prediction
view-invariance
tactile sensing
contact-rich manipulation
sim-to-real transfer
Innovation

Methods, ideas, or system contributions that make the work stand out.

view-invariant visuo-tactile prediction
3D mesh representation
Flow Matching
sim-to-real transfer