🤖 AI Summary
To address the challenge that visual data alone cannot capture the tactile signals critical to contact-intensive manipulation, this paper proposes a human-demonstration-driven robot learning framework built around OSMO, an isomorphic tactile glove. OSMO integrates twelve triaxial flexible sensors distributed across the fingertips and palm, enabling natural, in-the-wild manipulation data collection and, for the first time, direct zero-shot transfer of continuous normal and shear force signals from humans to robots—without image inpainting or vision-based force estimation. Its lightweight hardware design ensures compatibility with mainstream hand-tracking systems, while the algorithm employs an end-to-end tactile-to-action mapping strategy. Evaluated on a real-world wiping task, the framework achieves a 72% success rate—significantly outperforming vision-only baselines—while requiring no real-robot interaction data. The core contributions are: (1) the first open-source isomorphic wearable tactile device, and (2) a novel zero-shot tactile policy transfer paradigm.
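As a rough illustration of the data involved, a per-frame reading from twelve triaxial sensors can be flattened into a fixed-size feature vector and fed to a policy head. This is a minimal sketch, not the paper's implementation: the shapes, names, and the single linear layer standing in for the end-to-end tactile-to-action policy are all assumptions.

```python
# Hypothetical sketch of a tactile-to-action mapping (shapes/names assumed,
# not taken from the paper). OSMO reports 12 triaxial readings per frame.
import numpy as np

N_SENSORS = 12  # fingertip + palm sensor pads
AXES = 3        # (shear_x, shear_y, normal_z) per sensor

def flatten_tactile(frame: np.ndarray) -> np.ndarray:
    """Flatten a (12, 3) tactile frame into a 36-dim feature vector."""
    assert frame.shape == (N_SENSORS, AXES)
    return frame.reshape(-1)

def tactile_to_action(tactile: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """One linear layer as a stand-in for the learned tactile-to-action policy.
    tanh bounds the output, e.g. a 6-DoF end-effector velocity command."""
    return np.tanh(W @ tactile + b)

rng = np.random.default_rng(0)
frame = rng.normal(size=(N_SENSORS, AXES))      # one simulated glove reading
W = 0.1 * rng.normal(size=(6, N_SENSORS * AXES))
b = np.zeros(6)
action = tactile_to_action(flatten_tactile(frame), W, b)
```

Because both the human demonstrator and the robot wear the same glove, a policy consuming this tactile vector sees identically distributed inputs at training and deployment time, which is what makes the zero-shot transfer plausible.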
📝 Abstract
Human video demonstrations provide abundant training data for learning robot policies, but video alone cannot capture the rich contact signals critical for mastering manipulation. We introduce OSMO, an open-source wearable tactile glove designed for human-to-robot skill transfer. The glove features 12 three-axis tactile sensors across the fingertips and palm and is designed to be compatible with state-of-the-art hand-tracking methods for in-the-wild data collection. We demonstrate that a robot policy trained exclusively on human demonstrations collected with OSMO, without any real robot data, is capable of executing a challenging contact-rich manipulation task. By equipping both the human and the robot with the same glove, OSMO minimizes the visual and tactile embodiment gap, enabling the transfer of continuous shear and normal force feedback while avoiding the need for image inpainting or other vision-based force inference. On a real-world wiping task requiring sustained contact pressure, our tactile-aware policy achieves a 72% success rate, outperforming vision-only baselines by eliminating contact-related failure modes. We release complete hardware designs, firmware, and assembly instructions to support community adoption.