🤖 AI Summary
To meet the dual requirements of low computational complexity and robust performance for acoustic echo and noise reduction (AENR) on consumer devices, this paper proposes a hybrid approach that extends the state-of-the-art (SOTA) ULCNet model. The method adds time alignment and parallel encoder blocks for the model inputs, and introduces a channel-wise sampling-based feature reorientation method to keep performance robust across challenging scenarios. The resulting model achieves better echo reduction than existing SOTA methods and comparable noise reduction, while keeping overall computational and memory requirements low.
📝 Abstract
The successful deployment of deep learning-based acoustic echo and noise reduction (AENR) methods in consumer devices has spurred interest in developing low-complexity solutions, while emphasizing the need for robust performance in real-life applications. In this work, we propose a hybrid approach to enhance the state-of-the-art (SOTA) ULCNet model by integrating time alignment and parallel encoder blocks for the model inputs, resulting in better echo reduction than existing SOTA methods and comparable noise reduction. We also propose a channel-wise sampling-based feature reorientation method, ensuring robust performance across many challenging scenarios while maintaining overall low computational and memory requirements.
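The abstract does not say how the time alignment is carried out. As a rough illustration of the general idea only, the sketch below estimates the delay between a far-end reference and a microphone signal with a brute-force cross-correlation and then shifts the reference accordingly. This is a common classical technique and is an assumption here, not necessarily the paper's method; the names `estimate_delay` and `align_reference`, the synthetic signals, and the `max_delay` value are all hypothetical.

```python
import random


def estimate_delay(reference, microphone, max_delay):
    """Return the lag (in samples) that maximizes the cross-correlation
    between the far-end reference and the microphone signal.

    Hypothetical helper for illustration; not taken from the paper.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_delay + 1):
        # Correlate reference[n] with microphone[n + lag] over the overlap.
        overlap = min(len(reference), len(microphone) - lag)
        score = sum(reference[n] * microphone[n + lag] for n in range(overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align_reference(reference, delay):
    """Delay the reference by `delay` samples, zero-padded to the same length."""
    padded = [0.0] * delay + list(reference)
    return padded[: len(reference)]


# Synthetic example (assumed data): the echo is an attenuated, delayed copy
# of the far-end signal, plus a small amount of noise.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(400)]
true_delay = 37
echo = [0.0] * true_delay + [0.6 * x for x in far_end[: 400 - true_delay]]
mic = [e + 0.01 * rng.uniform(-1.0, 1.0) for e in echo]

delay = estimate_delay(far_end, mic, max_delay=100)
aligned = align_reference(far_end, delay)
```

In a pipeline of this kind, the aligned reference and the microphone signal would then be fed to the model's input encoders, so that the network does not have to learn the bulk delay itself. A practical implementation would likely use an FFT-based correlation for efficiency rather than this quadratic loop.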