AI Summary
This work addresses the challenge of simultaneously achieving compactness, durability, and accurate estimation of slip direction and magnitude in robotic gripping, a limitation of existing tactile sensing approaches. The authors propose a multi-channel acoustic sensing solution that embeds an array of piezoelectric microphones into parallel gripper jaws, paired with textured silicone contact pads to capture structured vibration signals during interaction. A lightweight convolutional network processes multi-channel log-mel spectrograms to enable real-time joint prediction of slip occurrence, direction, and magnitude. By leveraging spatially distributed acoustic perception, the method effectively resolves directional ambiguity, achieving a mean angular error of only 14.1° with four microphones. It outperforms baseline methods by up to 12% in detection accuracy and reduces direction and magnitude errors by 64% and 68%, respectively, compared to a single-microphone setup, while being successfully integrated into a closed-loop real-time control system.
Abstract
Reliable in-hand manipulation requires accurate real-time estimation of slip between a gripper and a grasped object. Existing tactile sensing approaches based on vision, capacitance, or force-torque measurements face fundamental trade-offs in form factor, durability, and their ability to jointly estimate slip direction and magnitude. We present A-SLIP, a multi-channel acoustic sensing system integrated into a parallel-jaw gripper for estimating continuous slip in the grasp plane. The A-SLIP sensor consists of piezoelectric microphones positioned behind a textured silicone contact pad to capture structured contact-induced vibrations. The A-SLIP model processes synchronized multi-channel audio as log-mel spectrograms using a lightweight convolutional network, jointly predicting the presence, direction, and magnitude of slip. Across experiments with robot- and externally induced slip conditions, the fine-tuned four-microphone configuration achieves a mean absolute directional error of 14.1 degrees, outperforms baselines by up to 12 percent in detection accuracy, and reduces directional error by 32 percent. Compared with single-microphone configurations, the multi-channel design reduces directional error by 64 percent and magnitude error by 68 percent, underscoring the importance of spatial acoustic sensing in resolving slip direction ambiguity. We further evaluate A-SLIP in closed-loop reactive control and find that it enables reliable, low-cost, real-time estimation of in-hand slip. Project videos and additional details are available at https://a-slip.github.io.
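The directional results above are reported as a mean absolute directional error in degrees. The abstract does not say how that metric is computed, so the following is only a minimal sketch of one common way to evaluate direction estimates in a plane, assuming angles are measured in degrees and that errors wrap around at 360. The function names (`angular_error_deg`, `mean_absolute_angular_error`, `direction_from_components`) are hypothetical and are not taken from the A-SLIP code.

```python
import math


def angular_error_deg(pred_deg: float, true_deg: float) -> float:
    """Smallest absolute difference between two directions, in [0, 180] degrees."""
    # Wrap the raw difference into [0, 360), then take the shorter way around.
    diff = (pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


def mean_absolute_angular_error(preds, truths) -> float:
    """Average wrap-aware angular error over paired predictions and labels."""
    if len(preds) != len(truths) or len(preds) == 0:
        raise ValueError("preds and truths must be non-empty and equal length")
    total = sum(angular_error_deg(p, t) for p, t in zip(preds, truths))
    return total / len(preds)


def direction_from_components(dx: float, dy: float) -> float:
    """Convert an in-plane slip vector (assumed x/y convention) to [0, 360) degrees."""
    return math.degrees(math.atan2(dy, dx)) % 360.0


if __name__ == "__main__":
    # 350 and 10 degrees are 20 degrees apart, not 340.
    print(angular_error_deg(350.0, 10.0))
    print(mean_absolute_angular_error([350.0, 90.0], [10.0, 100.0]))
```

The wrap-around step matters because a naive subtraction would score a prediction of 350 degrees against a label of 10 degrees as a 340 degree error, which would inflate any mean directional error.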