AI Summary
Problem: The performance of the conventional short-time Fourier transform (STFT) depends critically on hand-tuned or heuristically chosen hyperparameters such as window length, hop size, and overlap ratio. Existing tuning methods rely on grid search over discrete parameter values, which is computationally expensive and does not adapt to the task at hand.
Method: We propose a unified, fully differentiable STFT framework that models the core STFT operations (window function selection, overlap ratio, and the discrete Fourier transform) as end-to-end trainable, differentiable signal processing modules. This allows the STFT parameters to be optimized jointly with a downstream neural network by gradient-based learning.
Contribution/Results: By moving beyond discrete parameter spaces, our approach learns adaptive time-frequency representations driven directly by gradients. Experiments on synthetic and real-world signals show improved time-frequency resolution and consistent gains on downstream tasks, including classification and denoising, supporting both the effectiveness and the generality of differentiable time-frequency analysis.
Abstract
The short-time Fourier transform (STFT) is widely used for analyzing non-stationary signals. However, its performance is highly sensitive to its parameters, and manual or heuristic tuning often yields suboptimal results. To overcome this limitation, we propose a unified differentiable formulation of the STFT that enables gradient-based optimization of its parameters. This approach addresses the limitations of traditional STFT parameter tuning methods, which often rely on computationally intensive discrete searches. It enables fine-tuning of the time-frequency representation (TFR) based on any desired criterion. Moreover, our approach integrates seamlessly with neural networks, allowing joint optimization of the STFT parameters and network weights. The efficacy of the proposed differentiable STFT in enhancing TFRs and improving performance in downstream tasks is demonstrated through experiments on both simulated and real-world data.
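The gradient-based tuning described in the abstract can be illustrated with a small NumPy sketch. It is not the paper's formulation. It assumes a Gaussian window whose width `sigma` (in samples) is the only tunable parameter, uses the order-2 Rényi entropy of the normalized spectrogram as an example criterion, and derives the gradient by hand where an automatic-differentiation framework would normally supply it. The function names (`stft_with_sigma_grad`, `entropy_and_grad`, `fit_sigma`) and the toy chirp are hypothetical.

```python
import numpy as np

def stft_with_sigma_grad(x, sigma, support=256, hop=32):
    """Return the STFT S and dS/dsigma for a Gaussian window (assumed window family)."""
    t = np.arange(support) - (support - 1) / 2.0      # sample offsets from the window centre
    w = np.exp(-0.5 * (t / sigma) ** 2)               # window is smooth in sigma
    dw = w * t ** 2 / sigma ** 3                      # analytic derivative dw/dsigma
    starts = np.arange(0, len(x) - support + 1, hop)
    frames = x[starts[:, None] + np.arange(support)]  # shape (n_frames, support)
    # The STFT is linear in the window, so dS/dsigma is the STFT computed with dw/dsigma.
    S = np.fft.rfft(frames * w, axis=1)
    dS = np.fft.rfft(frames * dw, axis=1)
    return S, dS

def entropy_and_grad(x, sigma, **kw):
    """Order-2 Renyi entropy of the normalized spectrogram and its derivative w.r.t. sigma."""
    S, dS = stft_with_sigma_grad(x, sigma, **kw)
    P = np.abs(S) ** 2                                # spectrogram
    dP = 2.0 * np.real(np.conj(S) * dS)               # d|S|^2/dsigma
    A, B = np.sum(P ** 2), np.sum(P)
    loss = 2.0 * np.log(B) - np.log(A)                # equals -log(sum((P / B)**2))
    grad = 2.0 * np.sum(dP) / B - np.sum(2.0 * P * dP) / A
    return loss, grad

def fit_sigma(x, sigma0, lr=0.2, steps=50, **kw):
    """Plain gradient descent on theta = log(sigma), which keeps sigma positive."""
    theta = np.log(sigma0)
    for _ in range(steps):
        sigma = np.exp(theta)
        _, g = entropy_and_grad(x, sigma, **kw)
        theta -= lr * sigma * g                       # chain rule: d/dtheta = sigma * d/dsigma
    return float(np.exp(theta))

# Toy signal (an assumption for illustration): a linear chirp sweeping
# from 0.05 to 0.4 cycles/sample over 2048 samples.
n = np.arange(2048)
x = np.cos(2 * np.pi * (0.05 * n + 0.5 * (0.35 / 2048) * n ** 2))
sigma_hat = fit_sigma(x, sigma0=10.0)
```

Because `sigma` is continuous, the search is not restricted to a grid of window lengths. Replacing the entropy with another differentiable criterion, such as the loss of a downstream network, would only change `entropy_and_grad`.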