Interpolation filter design for sample rate independent audio effect RNNs

📅 2024-09-24
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 1
🤖 AI Summary
This work addresses the inflexibility of RNN-based audio effect models (e.g., guitar distortion) when run at a sample rate other than the one used in training, which stems from the sample rate being encoded in the model weights. To overcome this, we propose an extrapolation-filter-based method for undersampling at inference. Our key contributions are threefold: (i) an extrapolation filter design that approximates the fractional signal advance required in the RNN feedback path; (ii) a linearized stability analysis grounded in the Jacobian at a fixed point, enabling *a priori* prediction of which filters suit a given model; and (iii) a unified treatment of both oversampling and undersampling. Experiments show that a well-chosen interpolation filter gives high-quality results in both directions. We characterize how filter order affects audio quality, trace the unwanted output artifacts to instability around a fixed point, and establish an early-warning mechanism for such failures.
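
The stability check in contribution (ii) can be illustrated with a minimal sketch. It assumes the modified recurrence feeds back a filtered state, h[n] = f(c_0 h[n-1] + ... + c_K h[n-1-K], x[n]), and that J is the Jacobian of f with respect to the state at a fixed point. Neither this exact form nor the function name `max_pole_radius` comes from the summary. Under those assumptions, each eigenvalue λ of J contributes the roots of z^(K+1) = λ (c_0 z^K + ... + c_K), and the linearised system is stable only if every root lies inside the unit circle.

```python
import numpy as np

def max_pole_radius(jacobian_eigs, taps):
    """Largest pole magnitude of the linearised filtered-feedback recurrence.

    jacobian_eigs: eigenvalues of the state Jacobian at the fixed point.
    taps: feedback filter coefficients c_0..c_K applied to h[n-1]..h[n-1-K].
    A value below 1 indicates a locally stable fixed point.
    """
    taps = np.asarray(taps, dtype=complex)
    worst = 0.0
    for lam in np.atleast_1d(jacobian_eigs):
        # Characteristic polynomial: z^(K+1) - lam*(c_0 z^K + ... + c_K) = 0
        poly = np.concatenate(([1.0 + 0.0j], -lam * taps))
        worst = max(worst, float(np.abs(np.roots(poly)).max()))
    return worst

# Hypothetical example: a single Jacobian eigenvalue of -0.9.
no_filter = max_pole_radius([-0.9], [1.0])              # plain one-sample feedback
linear_extrap = max_pole_radius([-0.9], [1.25, -0.25])  # linear extrapolation taps
print(no_filter < 1.0, linear_extrap < 1.0)
```

In this made-up case the eigenvalue is stable with plain one-sample feedback but produces a pole outside the unit circle once the linear-extrapolation taps are applied, which is the kind of failure such an analysis could flag before runtime.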

📝 Abstract
Recurrent neural networks (RNNs) are effective at emulating the non-linear, stateful behavior of analog guitar amplifiers and distortion effects. Unlike the case of direct circuit simulation, RNNs have a fixed sample rate encoded in their model weights, making the sample rate non-adjustable during inference. Recent work has proposed increasing the sample rate of RNNs at inference (oversampling) by increasing the feedback delay length in samples, using a fractional delay filter for non-integer conversions. Here, we investigate the task of lowering the sample rate at inference (undersampling), and propose using an extrapolation filter to approximate the required fractional signal advance. We consider two filter design methods and analyse the impact of filter order on audio quality. Our results show that the correct choice of filter can give high quality results for both oversampling and undersampling; however, in some cases the sample rate adjustment leads to unwanted artefacts in the output signal. We analyse these failure cases through linearised stability analysis, showing that they result from instability around a fixed point. This approach enables an informed prediction of suitable interpolation filters for a given RNN model before runtime.
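
The fractional delay and fractional advance described above can be sketched with a Lagrange polynomial design. This is an illustrative choice, since the abstract does not name its two design methods. The function name `lagrange_feedback_taps` is hypothetical, and the sketch assumes the required feedback delay equals the ratio of inference rate to training rate, measured in samples at the inference rate. A delay above 1 (oversampling) interpolates between past states, while a delay below 1 (undersampling) extrapolates beyond the most recent one.

```python
import numpy as np

def lagrange_feedback_taps(delay, order):
    """Taps c_0..c_K so that sum_k c_k * h[n-1-k] approximates h[n-delay].

    Only past states h[n-1], h[n-2], ... are available in a recurrence, so
    the polynomial nodes sit at delays 1, 2, ..., order+1.
    """
    nodes = np.arange(1, order + 2, dtype=float)
    taps = np.ones(order + 1)
    for k in range(order + 1):
        for m in range(order + 1):
            if m != k:
                taps[k] *= (delay - nodes[m]) / (nodes[k] - nodes[m])
    return taps

# Hypothetical rates: a model trained at 48 kHz, run at 36 kHz and 72 kHz.
under = lagrange_feedback_taps(36 / 48, order=1)  # delay 0.75: extrapolation
over = lagrange_feedback_taps(72 / 48, order=1)   # delay 1.5: interpolation
print(under, over)
```

With order 1, the undersampling taps come out as 1.25 and -0.25 (linear extrapolation from the two most recent states) and the oversampling taps as 0.5 and 0.5 (linear interpolation). Higher orders fit a higher-degree polynomial through more past states, which is where the trade-off between accuracy and stability discussed in the abstract would come into play.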

Problem

Research questions and friction points this paper is trying to address.

Audio Processing
Sample Rate Conversion
Recurrent Neural Networks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Extrapolation Filter
Sound Quality Optimization
RNN Model Stability