🤖 AI Summary
This work addresses object-motion tracking for robotic in-hand manipulation, where visual occlusion is frequent and low-texture visuotactile images offer few stable correspondences for image- or geometry-matching methods. The authors propose TacSE3, a physically interpretable, decoupled motion-estimation pipeline evaluated with paired DM-Tac fingertip sensors. The method converts tactile observations into a 3D force field, estimates planar translation from contact-centroid displacement and rotation primarily from shear-related responses, and thereby tracks incremental rigid-body motion on SE(3). Using two sensors reduces translation-rotation ambiguity and supports rotation tracking across axes and object geometries. Experiments show that the estimate serves as a lightweight compensation signal that improves disturbance tolerance in downstream manipulation tasks without retraining the base policy.
📝 Abstract
Robotic in-hand manipulation requires reliable object-motion tracking under frequent visual occlusion, yet low-texture visuotactile images provide few stable correspondences for conventional image- or geometry-matching methods. This paper presents TacSE3, a tactile motion-estimation pipeline that converts low-texture visuotactile observations into a decoupled three-dimensional force field and estimates incremental rigid-body motion on SE(3). The method derives planar translation from contact-centroid motion and estimates rotation primarily from shear-related tactile responses, yielding a physically interpretable signal for in-gripper tracking and compensation. Experiments with paired DM-Tac fingertip sensors show that dual-sensor sensing reduces translation-rotation ambiguity, supports rotation tracking across axes and object geometries, and provides a lightweight compensation signal that improves disturbance tolerance in downstream manipulation tasks without retraining the base policy.
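The decoupling described in the abstract can be illustrated with a minimal sketch for a single sensor pad. The abstract gives no formulas, so everything here is an assumption made for illustration: the force field is taken to be sampled on a grid with coordinates `xs`, `ys`; translation is read from the shift of the force-weighted contact centroid; and in-plane rotation is obtained by a least-squares fit of a small-angle rotation to the shear field about that centroid. The function names (`contact_centroid`, `planar_increment`, `se3_increment`) are hypothetical and are not taken from TacSE3.

```python
import numpy as np


def contact_centroid(fz, xs, ys, eps=1e-9):
    """Force-weighted centroid of the contact patch, or None if there is no contact."""
    w = np.clip(fz, 0.0, None)  # ignore negative (non-contact) readings
    total = w.sum()
    if total < eps:
        return None
    return np.array([(w * xs).sum(), (w * ys).sum()]) / total


def planar_increment(fz_prev, fz_curr, shear_x, shear_y, xs, ys):
    """Return (dx, dy, dtheta) between two frames.

    Assumption: shear_x / shear_y approximate the tangential displacement of the
    contact surface between the two frames, so a small rotation dtheta about the
    centroid produces the field dtheta * (-ry, rx).
    """
    c_prev = contact_centroid(fz_prev, xs, ys)
    c_curr = contact_centroid(fz_curr, xs, ys)
    if c_prev is None or c_curr is None:
        return 0.0, 0.0, 0.0

    # Translation: displacement of the contact centroid.
    dx, dy = c_curr - c_prev

    # Rotation: weighted least-squares fit of the small-angle rotation model.
    w = np.clip(fz_curr, 0.0, None)
    rx, ry = xs - c_curr[0], ys - c_curr[1]
    num = (w * (rx * shear_y - ry * shear_x)).sum()
    den = (w * (rx**2 + ry**2)).sum()
    dtheta = num / den if den > 0 else 0.0
    return float(dx), float(dy), float(dtheta)


def se3_increment(dx, dy, dtheta):
    """Embed the planar increment as a 4x4 homogeneous transform (rotation about z)."""
    c, s = np.cos(dtheta), np.sin(dtheta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    T[0, 3], T[1, 3] = dx, dy
    return T
```

An object pose could then be tracked incrementally by composing successive increments, for example `T_obj = T_obj @ se3_increment(dx, dy, dtheta)`. This sketch covers only in-plane motion on one pad; the paper reports that pairing two fingertip sensors is what reduces translation-rotation ambiguity and supports rotation about other axes, which is not modeled here.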