🤖 AI Summary
To address the fundamental trade-off between noise suppression and speech distortion in multi-channel speech enhancement, this paper proposes a lightweight end-to-end neural approach that controls a parameterized multi-channel Wiener filter (PMWF) via predicted frequency-domain parameters. Unlike conventional fixed-parameter filters or black-box deep learning models, our method employs a compact neural network to estimate only the essential PMWF control parameters. This preserves the filter’s physical interpretability and enables explicit distortion control while still leveraging deep learning’s capacity for complex noise modeling. The architecture is designed for low latency and minimal computational complexity, making it suitable for real-time embedded deployment. Experimental results demonstrate that, under equivalent computational budgets, our method significantly outperforms multiple state-of-the-art baselines in objective metrics (PESQ, STOI) and subjective listening quality. To the best of our knowledge, this work represents the first effective neural parameterization of PMWF achieving joint optimization of efficiency and perceptual speech quality.
📝 Abstract
Noise suppression and speech distortion are two important aspects that must be balanced when designing multi-channel Speech Enhancement (SE) algorithms. Although neural network models have achieved state-of-the-art noise suppression, their non-linear operations often introduce high speech distortion. Conversely, classical signal processing algorithms such as the Parameterized Multi-channel Wiener Filter (PMWF) beamformer offer explicit mechanisms for controlling the suppression/distortion trade-off. In this work, we present NeuralPMWF, a system in which the PMWF is entirely controlled by a low-latency, low-compute neural network, yielding a low-complexity system with high noise reduction and low speech distortion. Experimental results show that the proposed approach achieves significantly better perceptual and objective speech enhancement than several competitive baselines that use similar computational resources.
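The explicit suppression/distortion control mentioned in the abstract can be illustrated with the textbook PMWF for a single frequency bin, w = (Φ_n⁻¹ Φ_s u) / (β + tr(Φ_n⁻¹ Φ_s)), where β = 0 gives the distortionless MVDR beamformer and β = 1 gives the multi-channel Wiener filter. The sketch below is a minimal illustration under stated assumptions, not the NeuralPMWF implementation: the function name `pmwf_weights`, the toy covariance matrices, and the hand-picked β values are all hypothetical, and the abstract does not specify which quantities the network predicts.

```python
import numpy as np

def pmwf_weights(phi_s, phi_n, beta, ref=0):
    """Textbook PMWF weights for one frequency bin (illustrative sketch).

    phi_s : (M, M) speech spatial covariance matrix
    phi_n : (M, M) noise spatial covariance matrix
    beta  : trade-off parameter (0 -> MVDR, 1 -> multi-channel Wiener filter)
    ref   : index of the reference microphone
    """
    ratio = np.linalg.solve(phi_n, phi_s)   # phi_n^{-1} phi_s
    lam = np.trace(ratio).real              # tr(phi_n^{-1} phi_s)
    return ratio[:, ref] / (beta + lam)

# Toy scene (assumed values): 4 microphones, rank-1 speech, random noise.
rng = np.random.default_rng(0)
M = 4
d = rng.standard_normal(M) + 1j * rng.standard_normal(M)
d = d / d[0]                                # steering vector relative to mic 0
phi_s = 2.0 * np.outer(d, d.conj())         # speech covariance, sigma * d d^H
a = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
phi_n = a @ a.conj().T + 0.1 * np.eye(M)    # Hermitian positive-definite noise

for beta in (0.0, 1.0, 5.0):
    w = pmwf_weights(phi_s, phi_n, beta)
    speech_gain = abs(np.vdot(w, d))            # |w^H d|, 1 means no distortion
    noise_power = np.vdot(w, phi_n @ w).real    # residual noise, w^H phi_n w
    print(f"beta={beta}: speech gain={speech_gain:.3f}, noise power={noise_power:.4f}")
```

In this toy setting, raising β lowers the residual noise power but also pulls the speech gain below 1, which is the trade-off a fixed β has to settle once and a predicted, frequency-dependent control could in principle adapt.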