🤖 AI Summary
To address the challenge of physical spatial perception for visually impaired navigation, this paper proposes an online spatial audio rendering framework leveraging depth sensors. Methodologically, it introduces sensor-centered 2D annular and 3D cylindrical grid representations, integrated with a VDB-Gaussian Process (VDB-GP) implicit distance field for compact, robust, real-time 360° mapping. Additionally, it incorporates room impulse response–driven binaural audio rendering to transform geometric structures into spatially faithful auditory cues. Key contributions include: (i) the first application of VDB-GP to audio-based mapping—balancing dynamic object robustness and computational efficiency; and (ii) a novel grid encoding scheme significantly enhancing audio rendering fidelity and adaptability. Experiments demonstrate superior performance over state-of-the-art methods in mapping accuracy, coverage, real-time operation (>30 FPS), and navigation effectiveness, validated through quantitative evaluation and user studies with visually impaired participants.
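The sensor-centred grid encoding described above can be sketched as a scatter-min projection of depth points into (azimuth, height) cells, each cell keeping the nearest obstacle distance. This is an illustrative sketch only; the bin counts, height bounds, and maximum range below are assumptions, not the paper's parameters.

```python
import numpy as np

def cylindrical_grid(points, n_azimuth=72, n_height=8,
                     z_min=-1.0, z_max=1.0, max_range=10.0):
    """Project sensor-frame 3D points into a sensor-centred cylindrical
    grid holding, per (azimuth, height) cell, the nearest obstacle
    distance. Setting n_height=1 degenerates to the 2D annular case.
    (Hypothetical parameters; not taken from the paper.)"""
    grid = np.full((n_azimuth, n_height), max_range)  # empty = max range
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    az = np.arctan2(y, x)                       # azimuth in [-pi, pi)
    dist = np.hypot(x, y)                       # horizontal range
    a = ((az + np.pi) / (2 * np.pi) * n_azimuth).astype(int) % n_azimuth
    h = np.clip(((z - z_min) / (z_max - z_min) * n_height).astype(int),
                0, n_height - 1)
    np.minimum.at(grid, (a, h), dist)           # keep nearest return per cell
    return grid
```

In an online setting, each incoming depth frame would refresh this grid from the distance field rather than from raw points, which is what makes the representation robust to dynamic objects.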
📝 Abstract
Robotic perception is becoming a key technology for navigation aids, especially helping individuals with visual impairments through spatial sonification. This paper introduces a mapping representation that accurately captures scene geometry for sonification, turning physical spaces into auditory experiences. Using depth sensors, we encode an incrementally built 3D scene into a compact 360-degree representation with angular and distance information, thereby aligning with human auditory spatial perception. The proposed framework performs localisation and mapping via VDB-Gaussian Process Distance Fields for efficient online scene reconstruction. The key aspect is a sensor-centric structure that maintains either a 2D-circular or a 3D-cylindrical raster-based projection. This spatial representation is then converted into binaural auditory signals using simple pre-recorded impulse responses from a representative room. Quantitative and qualitative evaluations show improvements in accuracy, coverage, runtime and suitability for sonification compared to other approaches, as well as effective handling of dynamic objects. An accompanying video demonstrates spatial sonification in room-like environments. https://tinyurl.com/ListenToYourMap
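The final step, turning one ring of nearest-distance cells into a two-channel signal with pre-recorded responses, can be sketched as below. Here `rirs` is a hypothetical stand-in mapping each azimuth bin to a measured two-channel room impulse response, and the 1/distance gain is an illustrative loudness cue, not necessarily the paper's rendering rule.

```python
import numpy as np

def render_binaural(grid_row, rirs, max_range=10.0):
    """Sum, over occupied azimuth cells, the pre-recorded binaural room
    impulse response for that direction, scaled so that nearer obstacles
    sound louder. `rirs` maps azimuth-bin -> (n_taps, 2) array.
    (Illustrative sketch; gain law and excitation are assumptions.)"""
    n_taps = next(iter(rirs.values())).shape[0]
    out = np.zeros((n_taps, 2))
    for a, d in enumerate(grid_row):
        if d >= max_range or a not in rirs:
            continue                    # empty cell: no sound source
        gain = 1.0 / max(d, 0.1)        # closer obstacle -> louder cue
        out += gain * rirs[a]           # impulse excitation == RIR itself
    return out
```

A richer excitation (e.g. a short tone per direction) would be convolved with each RIR instead; with a unit impulse the convolution reduces to the scaled RIR itself, which keeps the sketch minimal.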