🤖 AI Summary
This work addresses the lack of a unified differentiable framework for audio effect modeling. We introduce NablAFx, an open-source PyTorch framework supporting both black-box and gray-box modeling of audio effects. Methodologically, the gray-box approach integrates differentiable digital signal processing (DSP) blocks with neural controllers, enabling unified modeling of both parametric and non-parametric signal chains. The framework incorporates established black-box architectures—including WaveNet- and LSTM-based models—alongside conditioning methods, and provides an end-to-end toolchain for training, evaluation, logging, visualization, and comparison of approaches. The contributions are threefold: (1) an open-source, end-to-end differentiable infrastructure dedicated to audio effect modeling; (2) a versatile ecosystem for configuring, training, and benchmarking models across diverse effect types (e.g., distortion, compression, delay); and (3) publicly released code to support reproducible research in audio effect modeling.
📝 Abstract
We present NablAFx, an open-source framework developed to support research in differentiable black-box and gray-box modeling of audio effects. Built in PyTorch, NablAFx offers a versatile ecosystem to configure, train, evaluate, and compare various architectural approaches. It includes classes to manage model architectures, datasets, and training, along with features to compute and log losses, metrics, and media, and plotting functions to facilitate detailed analysis. It incorporates implementations of established black-box architectures and conditioning methods, as well as differentiable DSP blocks and controllers, enabling the creation of both parametric and non-parametric gray-box signal chains. The code is accessible at https://github.com/mcomunita/nablafx.
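To make the gray-box idea concrete, the sketch below shows a minimal parametric signal chain in plain PyTorch: a neural controller maps a user control (e.g. a "drive" knob) to the parameter of a differentiable DSP block, and gradients flow end-to-end through the chain. This is an illustrative example only; the class and method names here are hypothetical and do not reflect the actual NablAFx API.

```python
import torch
import torch.nn as nn

# Hypothetical sketch (not the NablAFx API): a parametric gray-box chain
# where a neural controller predicts the parameter of a differentiable
# DSP block, so the whole chain is trainable end-to-end.

class DiffGain(nn.Module):
    """Differentiable gain block: scales the input by a controller-supplied gain."""
    def forward(self, x, gain):
        return x * gain

class TanhClip(nn.Module):
    """Differentiable soft clipper: tanh nonlinearity, gradients pass through."""
    def forward(self, x):
        return torch.tanh(x)

class GrayBoxDistortion(nn.Module):
    """Controller maps a normalized control value to a positive pre-gain,
    then the signal is shaped by a differentiable waveshaper."""
    def __init__(self):
        super().__init__()
        self.controller = nn.Sequential(
            nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1)
        )
        self.pre_gain = DiffGain()
        self.clip = TanhClip()

    def forward(self, x, control):
        # softplus keeps the predicted gain positive
        gain = torch.nn.functional.softplus(self.controller(control))
        return self.clip(self.pre_gain(x, gain))

model = GrayBoxDistortion()
audio = torch.randn(1, 1024)      # batch of 1, 1024 audio samples
control = torch.tensor([[0.5]])   # normalized "drive" setting
out = model(audio, control)
out.abs().mean().backward()       # gradients reach the controller weights
```

Because every block in the chain is differentiable, the same training loop used for black-box neural networks can optimize the DSP parameters and the controller jointly, which is what enables the unified treatment of parametric and non-parametric chains described above.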