FingerEye: Continuous and Unified Vision-Tactile Sensing for Dexterous Manipulation

📅 2026-04-22

🤖 AI Summary
This work proposes FingerEye, a compact, low-cost vision-tactile sensor that provides continuous perception across the pre-contact, contact-initiation, and post-contact phases of robotic manipulation. Unlike conventional tactile sensors such as GelSight, which provide feedback only after contact is established, FingerEye uses binocular close-range RGB cameras whose image pairs carry implicit stereo depth before contact. After contact, external forces and torques deform a compliant ring, and marker-based pose estimation of that deformation serves as a proxy for the contact wrench, so the perception stream transitions smoothly from visual to tactile cues. A vision-tactile imitation learning policy fuses signals from multiple FingerEye sensors and learns dexterous manipulation from limited real-world data. A digital twin of the sensor and robot supplies visually augmented simulated observations for representation learning, which makes the learned policies more robust to variations in object appearance. Demonstrated tasks include coin standing, chip picking, letter retrieving, and syringe manipulation.
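
To illustrate why a binocular camera pair carries depth information at close range, the sketch below applies the standard rectified-stereo relation, depth = focal length × baseline / disparity. This is an assumption-laden illustration, not the paper's method: the summary describes depth as implicit, so the policy may never compute it explicitly. The function name and all numeric values are hypothetical.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in metres of a point seen by a rectified stereo pair.

    focal_px     : focal length in pixels (hypothetical calibration value)
    baseline_m   : distance between the two camera centres in metres
    disparity_px : horizontal pixel shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical numbers: 400 px focal length, 10 mm baseline, 80 px disparity.
depth_m = depth_from_disparity(400.0, 0.01, 80.0)
```

With these made-up values the point lies about 0.05 m from the cameras. Because disparity grows as the object approaches, a short baseline can still resolve small depth changes just before contact.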

📝 Abstract
Dexterous robotic manipulation requires comprehensive perception across all phases of interaction: pre-contact, contact initiation, and post-contact. Such continuous feedback allows a robot to adapt its actions throughout interaction. However, many existing tactile sensors, such as GelSight and its variants, only provide feedback after contact is established, limiting a robot's ability to precisely initiate contact. We introduce FingerEye, a compact and cost-effective sensor that provides continuous vision-tactile feedback throughout the interaction process. FingerEye integrates binocular RGB cameras to provide close-range visual perception with implicit stereo depth. Upon contact, external forces and torques deform a compliant ring structure; these deformations are captured via marker-based pose estimation and serve as a proxy for contact wrench sensing. This design enables a perception stream that smoothly transitions from pre-contact visual cues to post-contact tactile feedback. Building on this sensing capability, we develop a vision-tactile imitation learning policy that fuses signals from multiple FingerEye sensors to learn dexterous manipulation behaviors from limited real-world data. We further develop a digital twin of our sensor and robot platform to improve policy generalization. By combining real demonstrations with visually augmented simulated observations for representation learning, the learned policies become more robust to object appearance variations. Together, these design aspects enable dexterous manipulation across diverse object properties and interaction regimes, including coin standing, chip picking, letter retrieving, and syringe manipulation. The hardware design, code, appendix, and videos are available on our project website: https://nus-lins-lab.github.io/FingerEyeWeb/
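
The idea of using ring deformation as a proxy for contact wrench can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the marker pose (rotation `R`, translation `t`) is already available from some pose estimator, that deformations are small, and that the ring behaves like a linear spring with a hypothetical 6×6 stiffness matrix. The abstract does not say whether the pose deviation is converted to physical units or passed to the policy directly.

```python
import numpy as np


def rotation_vector(R: np.ndarray) -> np.ndarray:
    """Axis-angle vector of a 3x3 rotation matrix (valid for angles below pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-9:
        return np.zeros(3)
    axis = np.array([
        R[2, 1] - R[1, 2],
        R[0, 2] - R[2, 0],
        R[1, 0] - R[0, 1],
    ]) / (2.0 * np.sin(theta))
    return theta * axis


def wrench_proxy(R_rest, t_rest, R_now, t_now, stiffness):
    """Return [fx, fy, fz, tx, ty, tz] from the marker pose deviation.

    Assumption: a linear-spring model, wrench = stiffness @ deviation,
    with the deviation expressed in the rest (undeformed) marker frame.
    """
    d_trans = R_rest.T @ (t_now - t_rest)        # translation deviation
    d_rot = rotation_vector(R_rest.T @ R_now)    # rotation deviation
    return stiffness @ np.concatenate([d_trans, d_rot])


# Hypothetical example: ring pushed 1 mm along -z and twisted 0.1 rad about z.
c, s = np.cos(0.1), np.sin(0.1)
R_twist = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
K = np.diag([1000.0, 1000.0, 2000.0, 1.0, 1.0, 1.0])  # made-up stiffness values
wrench = wrench_proxy(np.eye(3), np.zeros(3), R_twist, np.array([0.0, 0.0, -0.001]), K)
```

Before contact the deviation is zero and so is the proxy; once the ring deforms, the same camera stream yields a six-dimensional signal, which is one way to read the abstract's smooth transition from visual to tactile feedback.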
Problem

Research questions and friction points this paper is trying to address.

dexterous manipulation
continuous perception
vision-tactile sensing
contact initiation
robotic interaction
Innovation

Methods, ideas, or system contributions that make the work stand out.

vision-tactile sensing
continuous perception
dexterous manipulation
digital twin
imitation learning