🤖 AI Summary
Robust dexterous manipulation in unstructured home environments remains challenging due to occlusions, visual clutter, and the difficulty of fine-grained contact control. To address this, we propose PolyTouch, a robot fingertip sensor that integrates three modalities in one compact design: high-resolution camera-based tactile imaging, acoustic vibration sensing, and peripheral vision, providing dense contact feedback across multiple timescales. We further train a tactile diffusion policy from human demonstrations that combines this tactile feedback with visuo-proprioceptive observations. The resulting contact-aware policy significantly outperforms policies that rely only on vision and/or proprioception on multiple contact-intensive manipulation tasks. Experiments also show at least a 20-fold increase in sensor lifespan over commercial tactile sensors, and the results together highlight the value of multi-modal contact sensing for dexterous home-service robots.
📝 Abstract
Achieving robust dexterous manipulation in unstructured domestic environments remains a significant challenge in robotics. Even with state-of-the-art robot learning methods, haptic-oblivious control strategies (i.e., those relying only on external vision and/or proprioception) often fall short due to occlusions, visual complexities, and the need for precise contact interaction control. To address these limitations, we introduce PolyTouch, a novel robot finger that integrates camera-based tactile sensing, acoustic sensing, and peripheral visual sensing into a single design that is compact and durable. PolyTouch provides high-resolution tactile feedback across multiple temporal scales, which is essential for efficiently learning complex manipulation tasks. Experiments demonstrate at least a 20-fold increase in lifespan over commercial tactile sensors, with a design that is both easy to manufacture and scalable. We then use this multi-modal tactile feedback along with visuo-proprioceptive observations to synthesize a tactile-diffusion policy from human demonstrations; the resulting contact-aware control policy significantly outperforms haptic-oblivious policies in multiple contact-aware manipulation tasks. This paper highlights how effectively integrating multi-modal contact sensing can hasten the development of effective contact-aware manipulation policies, paving the way for more reliable and versatile domestic robots. More information can be found at https://polytouch.alanz.info/