Differentiable Black-box and Gray-box Modeling of Nonlinear Audio Effects

📅 2025-02-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study systematically compares differentiable black-box and gray-box models across dozens of nonlinear audio effects (e.g., distortion, compression, fuzz), aiming to clarify which architectures suit which devices and to enable cross-device deployment. Method: We propose the first time-varying gray-box modeling framework supporting dynamic parameter identification; construct and open-source ToneTwist AFx, the first large-scale, community-driven audio effects dataset; and conduct the first integrated evaluation combining objective metrics (LSD, PESQ), cross-hardware benchmarking, and double-blind subjective listening tests. Our approach integrates differentiable neural networks, time-varying system identification, and end-to-end audio modeling. Contribution/Results: Gray-box architectures significantly outperform black-box counterparts in fidelity and generalization, especially for strongly nonlinear effects. All code, trained models, dataset, and evaluation protocols are publicly released to foster reproducible research and practical deployment.

📝 Abstract

Audio effects are extensively used at every stage of audio and music content creation. The majority of differentiable audio effects modeling approaches fall into the black-box or gray-box paradigms, and most models have been proposed for and applied to nonlinear effects such as guitar amplifiers, overdrive, distortion, fuzz and compressors. Although a plethora of architectures have been introduced for this task, there is still a lack of understanding of the state of the art, since most publications experiment with one type of nonlinear audio effect and a very small number of devices. In this work we aim to shed light on the audio effects modeling landscape by comparing black-box and gray-box architectures on a large number of nonlinear audio effects, identifying the most suitable for a wide range of devices. In the process, we also: introduce time-varying gray-box models and propose models for compressor, distortion and fuzz; publish a large dataset for audio effects research, ToneTwist AFx (https://github.com/mcomunita/tonetwist-afx-dataset), which is also the first open to community contributions; evaluate models on a variety of metrics; and conduct an extensive subjective evaluation. Code (https://github.com/mcomunita/nablafx) and supplementary material (https://github.com/mcomunita/nnlinafx-supp-material) are also available.

Problem

Research questions and friction points this paper is trying to address.

Compare black-box and gray-box audio modeling.

Identify best models for nonlinear audio effects.

Introduce and evaluate new time-varying gray-box models.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale comparison of black-box and gray-box architectures

Gray-box models for compressor, distortion and fuzz

Time-varying gray-box models
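
To make the gray-box idea concrete, here is a minimal sketch of a distortion chain built from interpretable blocks (pre-gain, a tanh clipper, a one-pole low-pass "tone" filter), where the gain can optionally vary over time. This is not the paper's architecture: the function names `gray_box_distortion` and `lfo`, the choice of blocks, and all parameter values are assumptions made for illustration. In a differentiable setting such parameters would typically be learned by gradient descent; here they are set by hand.

```python
import math


def gray_box_distortion(x, pre_gain, cutoff_hz, sample_rate=48000.0):
    """Hypothetical gray-box chain: pre-gain -> tanh clipper -> one-pole low-pass.

    pre_gain may be a float (time-invariant) or a per-sample sequence
    (time-varying), which is the only difference between the two cases.
    """
    # One-pole low-pass coefficient derived from the cutoff frequency.
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y = []
    state = 0.0
    for n, sample in enumerate(x):
        # Pick the gain for this sample: fixed scalar or control-signal value.
        g = pre_gain[n] if hasattr(pre_gain, "__getitem__") else pre_gain
        clipped = math.tanh(g * sample)           # static nonlinearity
        state = (1.0 - a) * clipped + a * state   # recursive low-pass filter
        y.append(state)
    return y


def lfo(num_samples, rate_hz, low, high, sample_rate=48000.0):
    """Sinusoidal control signal sweeping between low and high."""
    mid = 0.5 * (low + high)
    depth = 0.5 * (high - low)
    return [
        mid + depth * math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        for n in range(num_samples)
    ]


# Illustrative input: 0.1 s of a 220 Hz sine at 48 kHz (assumed values).
sr = 48000.0
x = [0.8 * math.sin(2.0 * math.pi * 220.0 * n / sr) for n in range(4800)]

# Time-invariant case: one fixed gain for the whole signal.
static_out = gray_box_distortion(x, pre_gain=5.0, cutoff_hz=4000.0, sample_rate=sr)

# Time-varying case: the gain follows a slow control signal.
gains = lfo(len(x), rate_hz=2.0, low=1.0, high=10.0, sample_rate=sr)
varying_out = gray_box_distortion(x, pre_gain=gains, cutoff_hz=4000.0, sample_rate=sr)
```

A black-box model would instead replace the whole chain with a generic network mapping input samples to output samples, with no individually interpretable parameters such as `pre_gain` or `cutoff_hz`.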
