🤖 AI Summary
In virtual analog audio modeling, injecting time-varying control parameters into recurrent neural networks (RNNs) via static concatenation causes audible artifacts (clicks, pops, and distortion) because the model output transitions abruptly when a parameter changes.
Method: We derive RNN variants whose hidden-state update dynamics are constrained to be asymptotically stable, guaranteeing Lyapunov stability under zero input. This yields smooth, artifact-free output responses to dynamic control signals. Crucially, the method preserves the original audio signal path, requiring no architectural modifications to the core synthesis network.
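One standard way to enforce asymptotic stability at zero input, sketched below as a toy vanilla-RNN cell in NumPy (an illustrative assumption, not the paper's architecture), is to keep the recurrent matrix's spectral norm below 1: since `tanh` is 1-Lipschitz, the zero-input hidden state then contracts to the origin, so no transient from a past control change can linger.

```python
import numpy as np

def spectral_clip(W, rho=0.95):
    """Rescale W so its largest singular value is at most rho.
    For h_{t+1} = tanh(W h_t + ...), tanh is 1-Lipschitz, so
    ||W||_2 <= rho < 1 makes the zero-input update a contraction."""
    s = np.linalg.norm(W, 2)  # largest singular value of W
    return W * (rho / s) if s > rho else W

rng = np.random.default_rng(0)
n = 8
W = spectral_clip(rng.normal(size=(n, n)))

# Zero-input rollout: each step shrinks ||h|| by at least the factor
# rho, so the state decays geometrically toward the origin.
h = rng.normal(size=n)
for _ in range(200):
    h = np.tanh(W @ h)
print(np.linalg.norm(h))  # prints a value near zero
```

This is only a sketch of the stability mechanism; the paper derives constraints for the commonly used recurrent cells themselves rather than post-hoc weight rescaling.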
Contribution/Results: The approach is general-purpose and interpretable, and it eliminates control-induced artifacts without post-processing or signal-domain regularization. Experiments show that discontinuities during control-parameter jumps are strongly suppressed, improving perceptual audio fidelity and robustness for industrial deployment. The results also suggest a general recipe for conditional control injection in neural audio synthesis.
📝 Abstract
Recurrent neural networks are used in virtual analog modeling applications to digitally replicate the sound of analog hardware audio processors. The controls of hardware devices can be used as a conditioning input to these networks. A common method for introducing control conditioning to these models is the direct static concatenation of controls with input audio samples, which we show produces audio artifacts under time-varied conditioning. Here we derive constraints for asymptotically stable variants of commonly used recurrent neural networks and demonstrate that asymptotic stability in recurrent neural networks can eliminate audio artifacts from the model output under zero input and time-varied conditioning. Furthermore, our results suggest a possible general solution to mitigate conditioning-induced artifacts in other audio neural network architectures, such as convolutional and state-space models.
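The static-concatenation scheme the abstract critiques can be sketched as follows (a hypothetical toy model with random weights, not the paper's trained network): the control value is appended to each audio sample before entering the recurrence, so a step in the control produces an immediate step in the output even over a silent input, which is the kind of click the paper aims to eliminate.

```python
import numpy as np

rng = np.random.default_rng(1)
hidden, n_ctrl = 8, 1

# Hypothetical weights for a vanilla RNN conditioned by statically
# concatenating the control value c_t with the audio sample x_t.
W_h = rng.normal(size=(hidden, hidden)) * 0.4
W_in = rng.normal(size=(hidden, 1 + n_ctrl))
w_out = rng.normal(size=hidden)

def run(x, c):
    """Run the toy conditioned RNN over an audio/control sequence."""
    h = np.zeros(hidden)
    y = []
    for xt, ct in zip(x, c):
        h = np.tanh(W_h @ h + W_in @ np.array([xt, ct]))
        y.append(w_out @ h)
    return np.array(y)

T = 64
x = np.zeros(T)                                          # silent input
c = np.concatenate([np.zeros(T // 2), np.ones(T // 2)])  # control jump
y = run(x, c)
# y is exactly zero before the jump, then steps abruptly to a nonzero
# value at the control change: the discontinuity heard as a click/pop.
```

The experiment shows the cause of the artifact, not its cure; in the paper the fix comes from the stability constraints on the recurrence, with the conditioning pathway left unchanged.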