AI Summary
This work addresses the underutilization of tactile information in robotic imitation learning, focusing on contact-rich tasks such as match striking. We propose the first end-to-end trainable vision-touch imitation learning framework tailored for dynamic contact manipulation. Our method integrates a high-frame-rate GelSight tactile sensor with an RGB camera and introduces a multimodal feature alignment and joint encoding mechanism, coupled with behavior cloning for policy learning. Key contributions include: (1) the first systematic empirical validation that tactile signals critically enhance policy generalization and robustness; and (2) a lightweight, differentiable visuotactile fusion architecture. Evaluated on a real robotic platform, our approach achieves a task success rate over 40% higher than vision-only baselines, demonstrating the importance of tactile feedback in rapid, dexterous, contact-rich manipulation.
Abstract
The field of robotic manipulation has advanced significantly in recent years. At the sensing level, several novel tactile sensors have been developed that provide accurate contact information. At the methodological level, learning from demonstrations has proven an efficient paradigm for obtaining performant robotic manipulation policies. Combining the two holds the promise of extracting crucial contact-related information from the demonstration data and actively exploiting it during policy rollouts. However, despite its potential, this combination remains underexplored. This work therefore proposes a multimodal, visuotactile imitation learning framework capable of efficiently learning fast and dexterous manipulation policies. We evaluate our framework on the dynamic, contact-rich task of robotic match lighting, a task in which tactile feedback influences human manipulation performance. The experimental results show that incorporating tactile information into the policies significantly improves performance by over 40%, thereby underlining the importance of tactile sensing for contact-rich manipulation tasks. Project website: https://sites.google.com/view/tactile-il
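
To make the idea of visuotactile behavior cloning concrete, the following is a minimal toy sketch, not the authors' architecture. It assumes that visual and tactile observations have already been reduced to small feature vectors, that fusion is plain concatenation, and that the policy is linear with a scalar action; all function names (`fuse`, `train_bc`, `make_toy_demos`) and the synthetic data are hypothetical. It only illustrates why a policy that sees both modalities can fit demonstrations whose actions depend on contact, while a vision-only policy cannot.

```python
import random

def fuse(visual_feat, tactile_feat):
    """Late fusion by concatenation (an assumption, not the paper's mechanism)."""
    return list(visual_feat) + list(tactile_feat)

def predict(weights, bias, feat):
    """Linear policy: one scalar action from a feature vector."""
    return sum(w * x for w, x in zip(weights, feat)) + bias

def bc_loss(weights, bias, dataset):
    """Behavior cloning objective: mean squared error against expert actions."""
    return sum((predict(weights, bias, f) - a) ** 2 for f, a in dataset) / len(dataset)

def train_bc(dataset, lr=0.2, epochs=500):
    """Full-batch gradient descent on the behavior cloning loss."""
    dim = len(dataset[0][0])
    n = len(dataset)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for feat, action in dataset:
            err = predict(weights, bias, feat) - action
            for i, x in enumerate(feat):
                grad_w[i] += 2.0 * err * x / n
            grad_b += 2.0 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

def make_toy_demos(n=64, seed=0):
    """Synthetic 'expert' data: the action depends on both vision and touch."""
    rng = random.Random(seed)
    demos = []
    for _ in range(n):
        visual = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        tactile = [rng.uniform(-1, 1)]
        action = 0.5 * visual[0] - 0.3 * visual[1] + 0.8 * tactile[0]
        demos.append((visual, tactile, action))
    return demos

demos = make_toy_demos()
fused_data = [(fuse(v, t), a) for v, t, a in demos]
vision_only_data = [(list(v), a) for v, t, a in demos]

w_f, b_f = train_bc(fused_data)
w_v, b_v = train_bc(vision_only_data)
fused_loss = bc_loss(w_f, b_f, fused_data)
vision_loss = bc_loss(w_v, b_v, vision_only_data)
print("fused loss:", fused_loss, "vision-only loss:", vision_loss)
```

In this toy setting the fused policy can drive the imitation loss close to zero, whereas the vision-only policy retains the error contributed by the unobserved tactile term. The numbers it prints say nothing about the real-robot results reported above.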