AI Summary
This work addresses the challenge of recovering audio from micro-vibrations when multiple sound sources are active simultaneously and environmental interference is present. We propose the first event-camera-based optical vibration sensing framework. Methodologically, it integrates active laser illumination, high-speed dynamic capture via an event camera (with microsecond-level temporal resolution), and an adaptive signal reconstruction algorithm to achieve highly sensitive, low-latency detection of sub-pixel, visually imperceptible vibrations. Our key contribution is the pioneering application of event cameras to optical vibration sensing, enabling simultaneous multi-source separation and robust audio reconstruction. Experiments demonstrate accurate recovery of multiple concurrently active sound sources even under strong interference; reconstructed audio quality matches state-of-the-art methods, while processing speed exceeds 30 FPS, enabling near real-time operation. The framework significantly enhances the vibration resolution capability and audio fidelity of active perception systems in dynamic, cluttered environments.
Abstract
Small vibrations observed in video can unveil information beyond what is visual, such as sound and material properties. It is possible to passively record these vibrations when they are visually perceptible, or actively amplify their visual contribution with a laser beam when they are not perceptible. In this paper, we improve upon the active sensing approach by leveraging event-based cameras, which are designed to efficiently capture fast motion. We demonstrate our method experimentally by recovering audio from vibrations, even for multiple simultaneous sources, and in the presence of environmental distortions. Our approach matches the state-of-the-art reconstruction quality at much faster speeds, approaching real-time processing.
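The core idea of turning an event stream into audio can be illustrated with a minimal sketch. This is not the paper's algorithm, only an assumption-laden toy: we suppose each pixel emits signed events when the laser-speckle intensity crosses a contrast threshold, so the signed event rate approximates the derivative of the surface displacement; integrating that rate at audio rate and removing slow drift recovers the waveform. The function name `events_to_audio`, the threshold value, and the moving-average high-pass are all hypothetical choices for illustration.

```python
import numpy as np

def events_to_audio(timestamps, polarities, fs=8000, duration=None):
    """Hypothetical reconstruction: bin signed events (microsecond-resolved
    brightness-change spikes) at audio rate, integrate to get displacement,
    and high-pass to keep only the audible band. Not the paper's method."""
    if duration is None:
        duration = float(timestamps.max())
    n = int(np.ceil(duration * fs))
    # Assign each event to an audio sample bin; events arrive far faster
    # than the audio sample period, so many events fall into each bin.
    bins = np.minimum((timestamps * fs).astype(int), n - 1)
    rate = np.bincount(bins, weights=polarities, minlength=n)
    # Integrate the signed event rate to recover relative displacement,
    # then subtract a moving average as a crude high-pass filter.
    disp = np.cumsum(rate)
    audio = disp - np.convolve(disp, np.ones(64) / 64, mode="same")
    peak = np.abs(audio).max()
    return audio / peak if peak > 0 else audio

# Toy demo: emit events from a simulated 440 Hz vibration using a simple
# contrast-threshold event-camera model, then reconstruct the tone.
t = np.arange(0, 0.05, 1e-6)              # microsecond time grid
sig = np.sin(2 * np.pi * 440 * t)         # surface displacement
thresh = 0.05                             # contrast threshold (assumed)
events_t, events_p, level = [], [], 0.0
for ti, s in zip(t, sig):
    while s - level > thresh:             # positive-polarity events
        level += thresh
        events_t.append(ti)
        events_p.append(1.0)
    while level - s > thresh:             # negative-polarity events
        level -= thresh
        events_t.append(ti)
        events_p.append(-1.0)

audio = events_to_audio(np.array(events_t), np.array(events_p),
                        fs=8000, duration=0.05)
```

A spectral peak of the reconstructed `audio` should sit near the driving 440 Hz tone, which is the sanity check one would run before scaling the idea up to real event data and multiple sources.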