Resampling Filter Design for Multirate Neural Audio Effect Processing

Date: 2025-01-30

AI Summary
Neural audio effect models implicitly encode the sample rate of their training data in their weights, so they cannot readily process audio at a different rate at inference. Method: Instead of modifying the network, this paper resamples the signal at the input and output of the model. It compares several resampling filter designs and proposes a two-stage design in which a half-band IIR filter is cascaded with a Kaiser-window FIR filter. Contribution/Results: The two-stage design gives results similar to or better than the previously proposed model adjustment method, with many fewer operations per sample and less than one millisecond of latency at typical audio rates, and it requires no change to the network architecture. For integer oversampling, cascaded half-band IIR and FIR interpolation and decimation filters can be combined with the model adjustment method to reduce aliasing in a range of distortion effect models.

Abstract
Neural networks have become ubiquitous in audio effects modelling, especially for guitar amplifiers and distortion pedals. One limitation of such models is that the sample rate of the training data is implicitly encoded in the model weights and therefore not readily adjustable at inference. Recent work explored modifications to recurrent neural network architectures to approximate a sample-rate-independent system, enabling audio processing at a rate that differs from the original training rate. This method works well for integer oversampling and can reduce aliasing caused by nonlinear activation functions. For small fractional changes in sample rate, fractional delay filters can be used to approximate sample rate independence, but in some cases this method fails entirely. Here, we explore the use of signal resampling at the input and output of the neural network as an alternative solution. We investigate several resampling filter designs and show that a two-stage design consisting of a half-band IIR filter cascaded with a Kaiser window FIR filter can give results similar to or better than those of the previously proposed model adjustment method, with many fewer operations per sample and less than one millisecond of latency at typical audio rates. Furthermore, we investigate interpolation and decimation filters for the task of integer oversampling and show that cascaded half-band IIR and FIR designs can be used in conjunction with the model adjustment method to reduce aliasing in a range of distortion effect models.
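To make the Kaiser-window FIR part of the idea concrete, here is a minimal sketch of a single-stage rational resampler in plain Python. It is not the paper's two-stage filter: the half-band IIR stage is omitted, and the function names, the 80 dB stopband target, the `rolloff` parameter, and the use of Kaiser's standard empirical formulas for the window shape and filter length are all assumptions made for illustration. The output is delay-compensated, which suits offline experiments; a real-time implementation would instead incur the latency reported by `fir_latency_ms`.

```python
import math


def bessel_i0(x):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term, k = 1.0, 1.0, 1
    while term > 1e-12 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total


def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def design_kaiser_lowpass(up, down, atten_db=80.0, rolloff=0.9):
    """Kaiser-windowed sinc lowpass for resampling by up/down (hypothetical helper).

    Frequencies are fractions of the Nyquist frequency at the upsampled rate.
    """
    q = max(up, down)
    width = (1.0 - rolloff) / q            # transition width
    cutoff = (1.0 + rolloff) / (2.0 * q)   # centre of the transition band
    # Kaiser's empirical formula for the window shape parameter.
    if atten_db > 50.0:
        beta = 0.1102 * (atten_db - 8.7)
    elif atten_db >= 21.0:
        beta = 0.5842 * (atten_db - 21.0) ** 0.4 + 0.07886 * (atten_db - 21.0)
    else:
        beta = 0.0
    # Kaiser's empirical length estimate, forced odd for an integer group delay.
    num_taps = math.ceil((atten_db - 7.95) / (2.285 * math.pi * width)) + 1
    num_taps |= 1
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        r = (n - mid) / mid
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)
        taps.append(cutoff * sinc(cutoff * (n - mid)) * window)
    gain = sum(taps)
    return [t / gain for t in taps]        # unit gain at DC


def resample(x, up, down, taps):
    """Resample x by the ratio up/down with a polyphase evaluation of the FIR."""
    delay = (len(taps) - 1) // 2           # group delay at the upsampled rate
    n_out = (len(x) * up + down - 1) // down
    out = []
    for m in range(n_out):
        centre = m * down + delay          # compensate the filter delay
        acc = 0.0
        # Visit only taps that line up with non-zero (non-stuffed) samples.
        for k in range(centre % up, len(taps), up):
            i = (centre - k) // up
            if 0 <= i < len(x):
                acc += taps[k] * x[i]
        out.append(up * acc)               # restore gain lost to zero-stuffing
    return out


def fir_latency_ms(taps, up, fs_in):
    """Latency of the linear-phase FIR in milliseconds, for input rate fs_in in Hz."""
    return 1000.0 * ((len(taps) - 1) / 2) / (up * fs_in)


if __name__ == "__main__":
    taps = design_kaiser_lowpass(3, 2)     # e.g. 32 kHz -> 48 kHz
    tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(400)]
    resampled = resample(tone, 3, 2, taps)
    print(len(taps), len(resampled), fir_latency_ms(taps, 3, 32000.0))
```

In a setup like the one the abstract describes, a resampler of this kind would sit on both sides of the model: the input is converted to the rate the network was trained at, and the output is converted back. A single-stage linear-phase design such as this one needs a long filter for a narrow transition band, and so adds more delay, which is one motivation for considering cascaded designs.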
Problem

Research questions and friction points this paper is trying to address.

Neural Networks
Audio Processing
Variable Sample Rate
Innovation

Methods, ideas, or system contributions that make the work stand out.

Resampling Filters
Neural Network Audio Processing
Efficient Sample Rate Conversion