AI Summary
This work addresses the performance limitations of microphone arrays in practical applications caused by the scarcity of densely sampled room impulse responses (RIRs). To this end, it introduces the first diffusion model-based framework for RIR interpolation. The proposed method adapts and extends diffusion mechanisms from image inpainting to one-dimensional acoustic impulse response signals, enabling high-fidelity synthesis of RIRs at missing spatial locations. Experimental results on real-world RIR datasets demonstrate that the approach robustly accomplishes interpolation tasks and significantly enhances the performance of multi-microphone speech enhancement and spatial audio processing systems. These findings validate the efficacy and practical utility of diffusion models in realistic acoustic scenarios.
Abstract
Room impulse response (RIR) estimation is a fundamental problem in spatial audio processing and speech enhancement. In this paper, we build upon our previously introduced diffusion-based inpainting framework for RIR interpolation and demonstrate its applicability to enhancing the performance of practical multi-microphone array processing tasks. Furthermore, we validate the robustness of this method in interpolating real-world RIRs.