🤖 AI Summary
This work addresses the challenge of performing high-precision tactile manipulation tasks—such as key insertion—under environmental uncertainty, where robots often struggle. The authors propose a novel approach integrating a soft robotic wrist with a retrieval-based tactile memory control framework. Central to this method is the Masked Tactile Trajectory Transformer (MAT³), which, for the first time, incorporates a masked encoding mechanism into tactile trajectory modeling to jointly capture spatiotemporal dependencies among actions, distributed tactile signals, torques, and proprioceptive feedback. The model adaptively extracts task-relevant features without requiring explicit subtask segmentation, enabling safe exploration and efficient reuse of past experiences. Extensive real-robot experiments on diverse insertion tasks demonstrate significantly higher success rates compared to baseline methods and strong generalization to unseen plugs and environmental conditions.
📝 Abstract
Tactile memory, the ability to store and retrieve touch-based experience, is critical for contact-rich tasks such as key insertion under uncertainty. To replicate this capability, we introduce Tactile Memory with Soft Robot (TaMeSo-bot), a system that integrates a soft wrist with tactile retrieval-based control to enable safe and robust manipulation. The soft wrist allows safe contact exploration during data collection, while tactile memory reuses past demonstrations via retrieval for flexible adaptation to unseen scenarios. The core of this system is the Masked Tactile Trajectory Transformer (MAT$^\text{3}$), which jointly models spatiotemporal interactions between robot actions, distributed tactile feedback, force-torque measurements, and proprioceptive signals. Through masked-token prediction, MAT$^\text{3}$ learns rich spatiotemporal representations by inferring missing sensory information from context, autonomously extracting task-relevant features without explicit subtask segmentation. We validate our approach on peg-in-hole tasks with diverse pegs and conditions in real-robot experiments. Our extensive evaluation demonstrates that MAT$^\text{3}$ achieves higher success rates than the baselines over all conditions and shows remarkable capability to adapt to unseen pegs and conditions.