🤖 AI Summary
This study compares black-box and gray-box differentiable models on a large number of nonlinear audio effects (e.g., compressor, distortion, fuzz), aiming to identify the architectures best suited to a wide range of devices. Method: The authors introduce time-varying gray-box models and propose models for compressor, distortion, and fuzz; publish ToneTwist AFx, a large audio effects dataset that is the first open to community contributions; and evaluate the models on a variety of metrics alongside an extensive subjective evaluation. Contribution/Results: The comparison covers far more effect types and devices than prior work, which typically examines one type of nonlinear effect and a very small number of devices. Code, dataset, and supplementary material are publicly released.
📝 Abstract
Audio effects are extensively used at every stage of audio and music content creation. The majority of differentiable audio effects modeling approaches fall into the black-box or gray-box paradigms, and most models have been proposed for and applied to nonlinear effects such as guitar amplifiers, overdrive, distortion, fuzz, and compressors. Although a plethora of architectures have been introduced for this task, there is still a lack of understanding of the state of the art, since most publications experiment with one type of nonlinear audio effect and a very small number of devices. In this work we aim to shed light on the audio effects modeling landscape by comparing black-box and gray-box architectures on a large number of nonlinear audio effects, identifying the most suitable for a wide range of devices. In the process, we also: introduce time-varying gray-box models and propose models for compressor, distortion, and fuzz; publish a large dataset for audio effects research, ToneTwist AFx (https://github.com/mcomunita/tonetwist-afx-dataset), that is also the first open to community contributions; evaluate models on a variety of metrics; and conduct extensive subjective evaluation. Code (https://github.com/mcomunita/nablafx) and supplementary material (https://github.com/mcomunita/nnlinafx-supp-material) are also available.