ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration

📅 2025-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of invertible, stable, and perceptually grounded filter banks for machine learning. It proposes ISAC, an invertible, strictly stable, complex wavelet-based filter bank that enables perfect reconstruction. ISAC is presented as the first framework to combine auditory-scale mapping (e.g., ERB), user-specified time-domain support, tunable center frequencies and bandwidths, and end-to-end differentiability. Its analysis-synthesis pair is constructed from parameterized FIR convolutional kernels, which makes it natively compatible with deep learning frameworks such as PyTorch. Experiments demonstrate that ISAC significantly improves model robustness and generalization in speech enhancement and source separation tasks. Moreover, it supports zero-latency real-time processing and lossless signal reconstruction. The implementation is publicly available.

📝 Abstract

This paper introduces ISAC, an invertible and stable, perceptually motivated filter bank that is specifically designed to be integrated into machine learning paradigms. More precisely, the center frequencies and bandwidths of the filters follow a non-linear, auditory frequency scale; the filter kernels have a user-defined maximum temporal support and may serve as learnable convolutional kernels; and there exists a corresponding filter bank such that both form a perfect reconstruction pair. ISAC provides a powerful and user-friendly audio front-end suitable for any application, including analysis-synthesis schemes.
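
As a rough illustration of what "center frequencies and bandwidths on an auditory scale" can mean, the sketch below places channels uniformly on the ERB-number scale using the widely used Glasberg and Moore formulas. This is an assumption for illustration: the abstract does not say which scale or constants ISAC uses by default, and the function names, channel count, and frequency range here are hypothetical.

```python
import math

def hz_to_erb_number(f_hz):
    """Map frequency in Hz to the ERB-number scale (Glasberg & Moore)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_number_to_hz(e):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at frequency f_hz."""
    return 24.7 * (0.00437 * f_hz + 1.0)

def erb_center_frequencies(num_channels, f_min, f_max):
    """Center frequencies spaced uniformly on the ERB-number scale."""
    e_min = hz_to_erb_number(f_min)
    e_max = hz_to_erb_number(f_max)
    step = (e_max - e_min) / (num_channels - 1)
    return [erb_number_to_hz(e_min + k * step) for k in range(num_channels)]

# Hypothetical configuration: 40 channels between 50 Hz and 8 kHz.
centers = erb_center_frequencies(40, 50.0, 8000.0)
bandwidths = [erb_bandwidth(fc) for fc in centers]
```

Channels end up densely packed and narrow at low frequencies and sparse and wide at high frequencies, which is the non-linear behavior the abstract refers to.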

Problem

Research questions and friction points this paper addresses.

Designing an invertible auditory filter bank for ML integration

Customizing filter kernels with user-defined temporal support

Ensuring perfect reconstruction in analysis-synthesis audio schemes
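
To make "invertible", "stable", and "perfect reconstruction" concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: stride 1 (no decimation), circular convolution, and a toy bank of eight short complex kernels with hand-picked, non-uniform center frequencies. In that setting the summed squared frequency responses S determine everything: the bank is invertible if S is strictly positive, the ratio max(S)/min(S) is one common way to quantify stability, and dividing by S yields a dual (synthesis) bank that reconstructs the input exactly.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]

N = 32   # signal length (toy size)
L = 8    # kernel support in samples
# Hypothetical non-uniform center frequencies in cycles per sample.
centers = [0.0, 0.05, 0.12, 0.22, 0.35, 0.5, 0.68, 0.85]

# Hann-like window with a half-sample offset, so no tap is exactly zero.
window = [0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 0.5) / L) for n in range(L)]

# Complex FIR kernels: modulated windows, zero-padded to the signal length.
kernels = [[window[n] * cmath.exp(2j * math.pi * fc * n) for n in range(L)]
           + [0.0] * (N - L) for fc in centers]
H = [dft(h) for h in kernels]

# Summed squared magnitude responses; min and max act as frame bounds A and B.
S = [sum(abs(Hk[f]) ** 2 for Hk in H) for f in range(N)]
A, B = min(S), max(S)

signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(N)]
X = dft(signal)

# Analysis: circular convolution with each kernel, done in the DFT domain.
coeffs = [[X[f] * Hk[f] for f in range(N)] for Hk in H]

# Synthesis with the canonical dual responses conj(H_k) / S.
X_rec = [sum(C[f] * Hk[f].conjugate() / S[f] for C, Hk in zip(coeffs, H))
         for f in range(N)]
x_rec = idft(X_rec)
err = max(abs(a - b) for a, b in zip(signal, x_rec))
```

The dual responses in this sketch are not guaranteed to correspond to short FIR kernels, and real filter banks usually decimate each channel; handling both is part of what a practical design has to solve.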

Innovation

Methods, ideas, or system contributions that make the work stand out.

Invertible and stable auditory filter bank

Customizable non-linear auditory frequency scale

Learnable convolutional kernels for ML
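
The interplay between bandwidth and a user-defined maximum temporal support can be sketched as follows. The length rule used here (a Hann-like window whose null-to-null main lobe roughly matches the requested bandwidth, then capped at `max_support`) is an illustrative assumption and not the construction from the paper; `make_kernel` and all parameter values are hypothetical.

```python
import cmath
import math

def make_kernel(fc_hz, bandwidth_hz, fs_hz, max_support):
    """Complex FIR kernel centered at fc_hz with at most max_support taps."""
    # Narrow bands ask for long windows; the cap enforces the support limit.
    length = max(2, min(max_support, round(4.0 * fs_hz / bandwidth_hz)))
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 0.5) / length)
              for n in range(length)]
    norm = math.sqrt(sum(w * w for w in window))
    # Modulate the unit-energy window to the center frequency.
    return [w / norm * cmath.exp(2j * math.pi * fc_hz * n / fs_hz)
            for n, w in enumerate(window)]

fs = 16000.0
# A narrow low-frequency band would need about 1829 taps, so it hits the cap.
low = make_kernel(100.0, 35.0, fs, max_support=128)
# A wide high-frequency band needs fewer taps than the cap allows.
high = make_kernel(6000.0, 672.0, fs, max_support=128)
```

Kernels built this way are plain lists of complex taps, so in a learning setup they could initialize the weights of a 1-D convolution layer (real and imaginary parts as separate channels) and then be fine-tuned.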
