Unsupervised Speech Dereverberation Using a Hybrid Model

📅 2025-10-10
🤖 AI Summary
This paper addresses a central obstacle in speech dereverberation: paired clean/reverberant speech data is rarely available for training. We propose a training paradigm that relies solely on reverberant speech and a small set of readily available acoustic parameters, most notably the reverberation time (RT60). Methodologically, we design a hybrid neural network architecture that explicitly incorporates RT60 priors and formulate an end-to-end unsupervised objective via self-supervised signal modeling, which removes the need for clean-speech supervision. The key contribution is the explicit integration of sparse but easily measured reverberation parameters into training, which is intended to improve robustness and generalization to unseen reverberant conditions. Experiments show that the approach performs more consistently than existing state-of-the-art methods across objective metrics, including PESQ and STOI.

📝 Abstract
This paper introduces a new training strategy to improve speech dereverberation systems in an unsupervised manner using only reverberant speech. Most existing algorithms rely on paired dry/reverberant data, which is difficult to obtain. Our approach uses limited acoustic information, like the reverberation time (RT60), to train a dereverberation system. Experimental results demonstrate that our method achieves more consistent performance across various objective metrics than the state-of-the-art.
Problem

Research questions and friction points this paper is trying to address.

Developing unsupervised speech dereverberation without clean data
Overcoming reliance on paired dry and reverberant speech data
Training systems using limited acoustic information like RT60
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised training strategy for speech dereverberation
Uses only reverberant speech without paired data
Employs limited acoustic information like RT60
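
For readers unfamiliar with RT60: it is the time the sound energy in a room takes to decay by 60 dB after the source stops. The sketch below is not the paper's method. It only illustrates what this single parameter describes, by building a synthetic impulse response (exponentially decaying noise, a common simplification) and recovering its RT60 with Schroeder backward integration. The function names `synthetic_rir` and `estimate_rt60`, the sample rate, and the -5 to -25 dB fit range are all assumptions chosen for illustration.

```python
import math
import random


def synthetic_rir(rt60, fs=8000, seed=0):
    """Toy impulse response: white noise under an exponential envelope.

    Assumption for illustration only; real rooms are more complex.
    """
    rng = random.Random(seed)
    n = int(1.5 * rt60 * fs)  # long enough to cover the full 60 dB decay
    # Amplitude envelope exp(-t / tau). Energy decays as exp(-2t / tau),
    # which reaches -60 dB at t = rt60 when tau = rt60 / (3 * ln 10).
    tau = rt60 / (3.0 * math.log(10.0))
    return [rng.gauss(0.0, 1.0) * math.exp(-(i / fs) / tau) for i in range(n)]


def estimate_rt60(rir, fs, lo_db=-5.0, hi_db=-25.0):
    """Estimate RT60 from an impulse response via Schroeder integration."""
    # Energy decay curve: energy remaining from each sample to the end.
    edc = []
    acc = 0.0
    for h in reversed(rir):
        acc += h * h
        edc.append(acc)
    edc.reverse()

    # Keep the points of the decay curve (in dB) inside the fit range.
    total = edc[0]
    pts = []
    for i, e in enumerate(edc):
        if e <= 0.0:
            break
        level = 10.0 * math.log10(e / total)
        if hi_db <= level <= lo_db:
            pts.append((i / fs, level))

    # Least-squares slope of level versus time, in dB per second.
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_l = sum(l for _, l in pts) / len(pts)
    num = sum((t - mean_t) * (l - mean_l) for t, l in pts)
    den = sum((t - mean_t) ** 2 for t, _ in pts)
    slope = num / den

    # Extrapolate the fitted slope to a 60 dB drop.
    return -60.0 / slope


if __name__ == "__main__":
    rir = synthetic_rir(rt60=0.5, fs=8000, seed=1)
    print(f"estimated RT60: {estimate_rt60(rir, 8000):.3f} s")
```

The estimate is noisy because the toy response is random, so it should land near the target value rather than match it exactly. In practice RT60 can also be measured or looked up for a room, which is why the abstract calls it limited but available acoustic information.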