BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement

📅 2024-12-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the compensation effect induced by amplitude-phase coupling and the excessive model complexity in single-channel speech enhancement, this paper proposes a dual-branch network that integrates band splitting with Mamba-based state-space modeling. The method introduces an amplitude-phase decoupled dual-branch architecture with an interaction module, so that amplitude and phase are handled in parallel branches, with phase captured implicitly through the complex spectrum. Band splitting compresses the frequency dimension, and a lightweight Mamba-based module models the time and frequency dimensions with linear complexity. Experiments show that the proposed approach reduces computational complexity by 8.3× on average relative to the baselines and by 25× relative to Transformer-based models, while maintaining state-of-the-art performance on PESQ, STOI, and other metrics.
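The decoupling idea in the summary can be illustrated with a small sketch. This is not the paper's architecture: it only shows, under simplifying assumptions, how a complex spectrum can be split into magnitude and phase so that one is modified without touching the other. The function names, the example bins, and the `gains` values are all hypothetical.

```python
import cmath

def split_mag_phase(spectrum):
    """Split complex spectral bins into separate magnitude and phase lists."""
    mags = [abs(z) for z in spectrum]
    phases = [cmath.phase(z) for z in spectrum]
    return mags, phases

def combine_mag_phase(mags, phases):
    """Rebuild complex bins from magnitudes and phases (polar to rectangular)."""
    return [cmath.rect(m, p) for m, p in zip(mags, phases)]

# Hypothetical noisy bins for a single frame (illustrative values only).
noisy = [3 + 4j, -1 + 0j, 0.5 - 0.5j]
mags, phases = split_mag_phase(noisy)

# Assumed magnitude-only gains: the phase list is left untouched,
# so a magnitude change cannot leak into the phase estimate.
gains = [0.5, 1.0, 0.0]
enhanced = combine_mag_phase([g * m for g, m in zip(gains, mags)], phases)
```

In a coupled real/imaginary formulation, an error in one component changes both magnitude and phase at once, which is one way to read the compensation effect described above.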

📝 Abstract
Although complex spectrum-based speech enhancement (SE) methods have achieved significant performance, coupling amplitude and phase can lead to a compensation effect, in which amplitude information is sacrificed to compensate for the phase, which is harmful to SE. In addition, to further improve SE performance, many modules are stacked onto SE models, which increases model complexity and limits their application. To address these problems, we propose a dual-path network based on compressed frequency using Mamba. First, we extract amplitude and phase information through parallel dual branches. This approach leverages structured complex spectra to implicitly capture phase information and resolves the compensation effect by decoupling amplitude and phase. The network also incorporates an interaction module that suppresses unnecessary parts and recovers missing components from the other branch. Second, to reduce network complexity, the network introduces a band-split strategy to compress the frequency dimension. To further reduce complexity while maintaining good performance, we design a Mamba-based module that models the time and frequency dimensions with linear complexity. Finally, compared to the baselines, our model achieves an average 8.3-fold reduction in computational complexity while maintaining superior performance. Furthermore, it achieves a 25-fold reduction in complexity compared to Transformer-based models.
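The band-split strategy mentioned in the abstract can be sketched as follows. This is a minimal illustration, assuming fixed band widths and a simple per-band mean as the compression step; the abstract does not state the band layout or the projection used, so `widths`, `compress`, and the example frame are hypothetical.

```python
def band_edges(widths):
    """Turn a list of band widths into cumulative bin boundaries."""
    edges = [0]
    for w in widths:
        edges.append(edges[-1] + w)
    return edges

def band_split(frame, widths):
    """Partition one frame of frequency bins into contiguous sub-bands."""
    edges = band_edges(widths)
    if edges[-1] != len(frame):
        raise ValueError("band widths must cover every frequency bin")
    return [frame[a:b] for a, b in zip(edges, edges[1:])]

def compress(frame, widths):
    """Reduce each sub-band to one feature (mean used as a stand-in
    for whatever learned projection a real model would apply)."""
    return [sum(band) / len(band) for band in band_split(frame, widths)]

# Hypothetical magnitudes for 8 frequency bins, split into 3 bands.
frame = [1.0, 3.0, 2.0, 2.0, 4.0, 0.0, 8.0, 4.0]
widths = [2, 2, 4]
features = compress(frame, widths)  # 8 bins become 3 band features
```

Later layers then operate on 3 band features per frame instead of 8 bins, which is the sense in which band splitting compresses the frequency dimension.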
Problem

Research questions and friction points this paper is trying to address.

Monaural Speech Enhancement
Amplitude-Phase Compensation Effect
Model Complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Branch Amplitude-Phase Decoupling
Mamba-Based Time-Frequency Modeling
Band-Split Frequency Compression
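To make the linear-complexity claim concrete, here is a heavily simplified scalar recurrence in the spirit of a selective state-space layer. It is a sketch under stated assumptions, not the Mamba implementation or the paper's module: the state is a single number, and `w_delta`, `decay`, and `c` are hypothetical parameters standing in for learned weights.

```python
import math

def softplus(v):
    """Smooth positive function used here to keep the step size > 0."""
    return math.log1p(math.exp(v))

def selective_scan(xs, w_delta=1.0, decay=1.0, c=1.0):
    """One pass over the sequence: O(len(xs)) time, O(1) state."""
    h = 0.0
    ys = []
    for x in xs:
        delta = softplus(w_delta * x)   # input-dependent step size (the "selective" part)
        a = math.exp(-decay * delta)    # discretized state transition, always in (0, 1)
        h = a * h + delta * x           # update the hidden state
        ys.append(c * h)                # read out the current state
    return ys

# Hypothetical feature sequence along the time (or band) axis.
ys = selective_scan([1.0, 0.0, -1.0, 2.0])
```

Each step touches only the previous state, so cost grows linearly with sequence length, whereas self-attention compares every pair of positions and grows quadratically.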
Cunhang Fan
Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
Enrui Liu
Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
Andong Li
Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Jianhua Tao
Department of Automation, Tsinghua University, Beijing, China
Jian Zhou
Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
Jiahao Li
Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
Chengshi Zheng
Institute of Acoustics, Chinese Academy of Sciences
Speech enhancement, microphone array, deep learning
Zhao Lv
Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China