🤖 AI Summary
To address the challenge of single-microphone sound source localization for mobile robots in highly reverberant environments, this paper proposes a lightweight online localization method. It employs a compact temporal feature extraction network, containing only 43k parameters, to estimate sound source distance in real time directly from single-channel reverberant signals. This distance estimate is fused with robot motion data via an extended Kalman filter (EKF) to jointly and recursively estimate both the robot pose and the sound source position. To our knowledge, this is the first method to achieve stable, real-time online localization under the triple constraints of a single microphone, a mobile platform, and strong reverberation. Experiments in realistic acoustic environments demonstrate centimeter-level distance estimation accuracy. The implementation is open-sourced and validated for computational efficiency, low latency (under 10 ms per frame), and robustness across diverse reverberant conditions, bridging a critical gap in dynamic sound source localization with single-microphone systems.
📝 Abstract
Accurately estimating sound source positions is crucial for robot audition. However, existing sound source localization methods typically rely on a microphone array with at least two spatially preconfigured microphones, a requirement that limits the applicability of robot audition systems. To alleviate this limitation, we propose an online sound source localization method that uses a single microphone mounted on a mobile robot in reverberant environments. Specifically, we develop a lightweight neural network with only 43k parameters that performs real-time distance estimation by extracting temporal information from reverberant signals. The estimated distances are then processed by an extended Kalman filter to achieve online sound source localization. To the best of our knowledge, this is the first work to achieve online sound source localization using a single microphone on a moving robot. Extensive experiments demonstrate the effectiveness and merits of our approach. To benefit the broader research community, we have open-sourced our code at https://github.com/JiangWAV/single-mic-SSL.
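The fusion step described above can be sketched as a range-only EKF. The snippet below is a minimal illustration under simplifying assumptions, not the paper's implementation: it treats the robot pose as known from odometry and estimates only a static 2D source position, whereas the paper jointly estimates robot pose and source position. All function names, noise parameters, and the simulated trajectory are illustrative.

```python
import numpy as np

def ekf_range_update(x, P, z, robot_pos, R=0.01**2):
    """One EKF measurement update for a static 2D source.

    x:         (2,) current source-position estimate
    P:         (2,2) estimate covariance
    z:         measured source distance (here, what the distance
               network would output)
    robot_pos: (2,) robot position, assumed known from odometry
    R:         measurement noise variance (centimeter-level std)
    """
    diff = x - robot_pos
    d_pred = np.linalg.norm(diff)              # predicted range h(x)
    H = (diff / d_pred).reshape(1, 2)          # Jacobian of h at x
    S = float(H @ P @ H.T) + R                 # innovation covariance (scalar)
    K = P @ H.T / S                            # Kalman gain, shape (2,1)
    x = x + (K * (z - d_pred)).ravel()         # state correction
    P = (np.eye(2) - K @ H) @ P                # covariance update
    return x, P

# Toy usage: robot drives along the x-axis, repeatedly measuring its
# distance to a source at (2, 1); motion makes the range-only problem
# observable (up to the mirror ambiguity resolved by the prior).
rng = np.random.default_rng(0)
true_src = np.array([2.0, 1.0])
x, P = np.array([0.5, 0.2]), np.eye(2) * 4.0   # vague prior, off the robot path
for t in np.linspace(0.0, 3.0, 30):
    robot = np.array([t, 0.0])
    z = np.linalg.norm(true_src - robot) + rng.normal(0.0, 0.01)
    x, P = ekf_range_update(x, P, z, robot)
```

Note that with a single range measurement per step, observability comes from the robot's motion: distances taken from several distinct positions constrain the source to intersecting circles, which is why the paper's setting (a moving platform) makes single-microphone localization feasible at all.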