Differentiable Time-Varying IIR Filtering for Real-Time Speech Denoising

📅 2026-03-03
🤖 AI Summary
This work addresses the challenge of combining dynamic adaptation to non-stationary noise, low latency, and model interpretability in real-time speech denoising. The authors propose an interpretable, end-to-end speech enhancement architecture that pairs the explicit modeling of digital signal processing with the adaptability of deep learning. A lightweight neural network predicts, in real time, the time-varying coefficients of a differentiable 35-band cascaded IIR filter, so every spectral modification is explicit. Experiments on the Valentini-Botinhao dataset compare the method against a static DDSP (differentiable digital signal processing) baseline and a fully deep-learning-based model, and show that it adapts effectively to changing noise conditions while keeping latency low.

📝 Abstract
We present TVF (Time-Varying Filtering), a low-latency speech enhancement model with 1 million parameters. Combining the interpretability of Digital Signal Processing (DSP) with the adaptability of deep learning, TVF bridges the gap between traditional filtering and modern neural speech modeling. The model uses a lightweight neural network backbone to predict the coefficients of a differentiable 35-band IIR filter cascade in real time, allowing it to adapt dynamically to non-stationary noise. Unlike "black-box" deep learning approaches, TVF offers a completely interpretable processing chain in which spectral modifications are explicit and adjustable. We demonstrate the efficacy of this approach on a speech denoising task using the Valentini-Botinhao dataset and compare the results to a static differentiable DSP (DDSP) approach and a fully deep-learning-based solution, showing that TVF achieves effective adaptation to changing noise conditions.
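To make the idea of a time-varying IIR cascade concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes details the abstract does not state: each band is a peaking biquad (standard audio-EQ-cookbook formulas), the only time-varying parameter is a per-band gain in dB, coefficients are updated once per frame, and the band centers, Q, sample rate, and frame length are illustrative. The names `peaking_biquad` and `time_varying_cascade` are hypothetical, and random gains stand in for the output of the neural network.

```python
import numpy as np

def peaking_biquad(gain_db, f0, fs, q=4.0):
    """Peaking-EQ biquad (audio-EQ-cookbook form), normalized so a[0] == 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1.0 + alpha * A, -2.0 * np.cos(w0), 1.0 - alpha * A])
    a = np.array([1.0 + alpha / A, -2.0 * np.cos(w0), 1.0 - alpha / A])
    return b / a[0], a / a[0]

def time_varying_cascade(x, gains_db, centers_hz, fs, frame_len):
    """Run x through a biquad cascade whose band gains change every frame.

    gains_db has shape (num_frames, num_bands). Filter states are carried
    across frame boundaries. Samples past the last frame are left unfiltered.
    """
    num_frames, num_bands = gains_db.shape
    y = np.array(x, dtype=float)          # working copy of the input
    state = np.zeros((num_bands, 2))      # transposed direct form II states
    for m in range(num_frames):
        # Assumption: one coefficient update per frame.
        coeffs = [peaking_biquad(gains_db[m, k], centers_hz[k], fs)
                  for k in range(num_bands)]
        start = m * frame_len
        stop = min(start + frame_len, len(y))
        for n in range(start, stop):
            v = y[n]
            for k, (b, a) in enumerate(coeffs):
                out = b[0] * v + state[k, 0]
                state[k, 0] = b[1] * v - a[1] * out + state[k, 1]
                state[k, 1] = b[2] * v - a[2] * out
                v = out                    # feed the next band in the cascade
            y[n] = v
    return y

# Illustrative usage: 35 log-spaced bands (assumed layout), 10 ms frames at 16 kHz.
fs = 16000
centers = np.geomspace(50.0, 7000.0, 35)
rng = np.random.default_rng(0)
noisy = rng.standard_normal(fs // 10)              # 100 ms of placeholder audio
gains = rng.uniform(-12.0, 0.0, size=(10, 35))     # stand-in for network output
enhanced = time_varying_cascade(noisy, gains, centers, fs, frame_len=160)
```

Because every band is described by a gain at a known center frequency, the applied spectral shaping can be read directly from `gains`, which is the kind of interpretability the abstract refers to. A trainable version would express the same recursion in an automatic-differentiation framework so that gradients reach the network predicting the gains.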
Problem

Research questions and friction points this paper is trying to address.

speech denoising
time-varying filtering
non-stationary noise
real-time processing
interpretable model
Innovation

Methods, ideas, or system contributions that make the work stand out.

differentiable IIR filtering
time-varying filtering
speech denoising
interpretable deep learning
real-time audio processing
Authors
Riccardo Rota
Logitech Europe S.A., Switzerland; EPFL (École Polytechnique Fédérale de Lausanne), Switzerland
Kiril Ratmanski
Logitech Europe S.A., Switzerland
Jozef Coldenhoff
Logitech Europe S.A., Switzerland
Milos Cernak
Logitech, EPFL - Quartier de l'Innovation
Meeting Speech, Speech Analysis-Synthesis and Coding, Pathological Speech Processing, Artificial Intelligence