🤖 AI Summary
This study addresses the high annotation cost and low shear-direction prediction accuracy of vision-based tactile sensors (VBTSs) across device, illumination, and marker-modality variations. We propose TransForce, a transferable framework integrating sequential image translation with recurrent force estimation. Methodologically, we employ domain-adaptive image translation (via GANs or VAEs) to bridge source-to-target tactile image domains, model temporal dynamics with recurrent networks (RNNs/LSTMs), and learn separate representations for the RGB and binary marker modalities. We empirically find that the binary marker modality better supports shear-force (x/y-axis) prediction, whereas the RGB modality excels at normal-force (z-axis) estimation. Experiments demonstrate that the framework achieves mean absolute errors of 0.69 N, 0.70 N, and 1.11 N on the x-, y-, and z-axes, respectively (corresponding relative errors: 5.8%, 5.8%, 6.9%), significantly outperforming single-frame baseline models.
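The recurrent stage described above can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: the feature dimensions, hidden size, and randomly initialized weights are all placeholder assumptions. A vanilla RNN cell consumes a sequence of per-frame tactile-image features and regresses a 3-axis force (Fx, Fy, Fz) from the final hidden state.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_force_estimator(feature_seq, hidden_dim=32):
    """feature_seq: (T, D) array of per-frame tactile-image features.

    Returns a (3,) array: estimated (Fx, Fy, Fz).
    Weights are random stand-ins for trained parameters (illustrative only).
    """
    T, D = feature_seq.shape
    W_xh = rng.normal(0, 0.1, (D, hidden_dim))          # input-to-hidden
    W_hh = rng.normal(0, 0.1, (hidden_dim, hidden_dim)) # hidden-to-hidden
    W_hy = rng.normal(0, 0.1, (hidden_dim, 3))          # hidden-to-force
    h = np.zeros(hidden_dim)
    for x in feature_seq:            # recur over time steps
        h = np.tanh(x @ W_xh + h @ W_hh)
    return h @ W_hy                  # force estimate from last hidden state

# A 10-frame sequence of 64-dim features yields one (Fx, Fy, Fz) estimate.
forces = rnn_force_estimator(rng.normal(size=(10, 64)))
print(forces.shape)  # (3,)
```

In practice the per-frame features would come from a convolutional encoder applied to the translated tactile images, and the weights would be trained on the existing image-force pairs from the source sensor.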
📄 Abstract
Vision-based tactile sensors (VBTSs) provide high-resolution tactile images crucial for robot in-hand manipulation. However, force sensing in VBTSs is underutilized due to the costly and time-intensive process of acquiring paired tactile images and force labels. In this study, we introduce a transferable force prediction model, TransForce, designed to leverage collected image-force paired data for new sensors under varying illumination colors and marker patterns while improving the accuracy of predicted forces, especially in the shear direction. Our model translates tactile images from the source domain to the target domain, ensuring that the generated tactile images reflect the illumination colors and marker patterns of the new sensors while accurately preserving the elastomer deformation observed in existing sensors, which benefits force prediction for the new sensors. A recurrent force prediction model trained on the generated sequential tactile images and existing force labels then estimates forces for new sensors with higher accuracy, achieving the lowest average errors of 0.69 N (5.8% of the full work range) in the $x$-axis, 0.70 N (5.8%) in the $y$-axis, and 1.11 N (6.9%) in the $z$-axis, compared with models trained on single images. The experimental results also reveal that the pure marker modality is more helpful than the RGB modality in improving force accuracy in the shear direction, while the RGB modality shows better performance in the normal direction.
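As a quick sanity check on the reported numbers, the relative errors can be tied back to the absolute ones. Assuming "relative error" here means MAE divided by the full work range (an interpretation, not stated explicitly in the abstract), the implied work range per axis follows directly:

```python
# Implied full work range per axis, assuming relative error = MAE / range.
mae = {"x": 0.69, "y": 0.70, "z": 1.11}    # reported MAE, in newtons
rel = {"x": 0.058, "y": 0.058, "z": 0.069} # reported relative errors
work_range = {axis: mae[axis] / rel[axis] for axis in mae}
print({axis: round(r, 1) for axis, r in work_range.items()})
# roughly 12 N in shear (x, y) and 16 N in the normal (z) direction
```

This suggests the sensor's shear work range is around 12 N per axis and the normal range around 16 N, under the stated assumption about how relative error is defined.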