🤖 AI Summary
This work addresses two limitations of existing distributed speech enhancement methods for wireless acoustic sensor networks (WASNs): they reach the centralized solution only iteratively, which can be slow and impractical, and many assume that all nodes observe the same set of sources of interest, which often does not hold in practice. The paper proposes the distributed multichannel Wiener filter (dMWF), a non-iterative algorithm for fully connected WASNs based on the linear minimum mean square error (LMMSE) criterion. Each pair of nodes exchanges a pair-specific, low-dimensional fused signal that estimates the contribution of the sources observed by both nodes, which reduces communication bandwidth relative to a centralized system with access to all microphone signals. The paper formally proves that the dMWF is optimal even when nodes observe different sets of sources. In simulated speech enhancement experiments, the dMWF outperforms DANSE on objective metrics after short operation times, which highlights the benefit of its non-iterative design.
📝 Abstract
In a wireless acoustic sensor network (WASN), devices (i.e., nodes) can collaborate through distributed algorithms to collectively perform audio signal processing tasks. This paper focuses on the distributed estimation of node-specific desired speech signals using network-wide Wiener filtering. The objective is to match the performance of a centralized system that would have access to all microphone signals, while reducing the communication bandwidth usage of the algorithm. Existing solutions, such as the distributed adaptive node-specific signal estimation (DANSE) algorithm, converge towards the multichannel Wiener filter (MWF) which solves a centralized linear minimum mean square error (LMMSE) signal estimation problem. However, they do so iteratively, which can be slow and impractical. Many solutions also assume that all nodes observe the same set of sources of interest, which is often not the case in practice. To overcome these limitations, we propose the distributed multichannel Wiener filter (dMWF) for fully connected WASNs. The dMWF is non-iterative and optimal even when nodes observe different sets of sources. In this algorithm, nodes exchange neighbor-pair-specific, low-dimensional (fused) signals estimating the contribution of sources observed by both nodes in the pair. We formally prove the optimality of dMWF and demonstrate its performance in simulated speech enhancement experiments. The proposed algorithm is shown to outperform DANSE in terms of objective metrics after short operation times, highlighting the benefit of its iterationless design.
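For readers unfamiliar with the centralized benchmark mentioned in the abstract, the following is a minimal sketch of a multichannel Wiener filter computed from sample statistics. It is not the dMWF itself and does not reproduce the paper's experiments. The function name `centralized_mwf`, the single-source mixing model, the noise level, and the choice of microphone 0 as reference are all illustrative assumptions. The desired signal `d` is available here only because the data are simulated; in practice the cross-covariance would have to be estimated from the observations.

```python
import numpy as np

def centralized_mwf(y, d):
    """LMMSE filter W minimizing the sample average of |d - W^H y|^2.

    y: (M, T) array stacking all M microphone signals in the network.
    d: (Q, T) array of desired signals (known here only because we simulate).
    Returns W with shape (M, Q).
    """
    T = y.shape[1]
    R_yy = y @ y.conj().T / T  # sample covariance of the observations
    R_yd = y @ d.conj().T / T  # sample cross-covariance with the desired signal
    return np.linalg.solve(R_yy, R_yd)  # solves R_yy W = R_yd

# Hypothetical toy scene: one source, M microphones, uncorrelated sensor noise.
rng = np.random.default_rng(0)
M, T = 6, 20000
s = rng.standard_normal((1, T))        # latent source signal
a = rng.standard_normal((M, 1))        # assumed mixing (steering) vector
n = 0.5 * rng.standard_normal((M, T))  # sensor noise
y = a @ s + n                          # stacked microphone signals
d = a[:1] @ s                          # desired: source component at microphone 0

W = centralized_mwf(y, d)
d_hat = W.conj().T @ y                 # filtered estimate of d

mse_mwf = np.mean(np.abs(d - d_hat) ** 2)  # error using all microphones
mse_raw = np.mean(np.abs(d - y[:1]) ** 2)  # error using microphone 0 alone
print(f"MSE, microphone 0 only: {mse_raw:.4f}")
print(f"MSE, centralized MWF:   {mse_mwf:.4f}")
```

Because the filter is the least-squares solution over all linear combinations of the microphone signals, its sample error cannot exceed that of any single microphone. A distributed algorithm aims to reach this solution without every node receiving all `M` raw signals.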