Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level dropin & Neuroplasticity Mechanisms

📅 2026-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of balancing performance and efficiency in deepfake audio detection, where existing methods are constrained by fixed model architectures. Inspired by mammalian neuroplasticity, the authors propose a novel neuron-level dynamic mechanism—comprising “dropin” and “plasticity”—that enables flexible adjustment of model parameters across diverse architectures such as ResNet, Gated Recurrent Neural Networks, and Wav2Vec without requiring full retraining. This approach overcomes a limitation of low-rank adaptation, which is typically restricted to attention-based modules. Evaluated on the ASVSpoof2019 LA/PA and FakeorReal datasets, the method significantly enhances detection performance, reducing the Equal Error Rate by up to 39% with dropin and 66% with plasticity, while simultaneously improving computational efficiency.

📝 Abstract
Current audio deepfake detection has achieved remarkable performance using diverse deep learning architectures such as ResNet, and has seen further improvements with the introduction of large models (LMs) like Wav2Vec. The success of large language models (LLMs) further demonstrates the benefits of scaling model parameters, but also highlights a bottleneck where performance gains are constrained by parameter counts. Simply stacking additional layers, as done in current LLMs, is computationally expensive and requires full retraining. Furthermore, existing low-rank adaptation methods are primarily applied to attention-based architectures, which limits their scope. Inspired by the neuronal plasticity observed in mammalian brains, we propose novel algorithms, dropin and further plasticity, that dynamically adjust the number of neurons in certain layers to flexibly modulate model parameters. We evaluate these algorithms on multiple architectures, including ResNet, Gated Recurrent Neural Networks, and Wav2Vec. Experimental results on the widely recognised ASVSpoof2019 LA, PA, and FakeorReal datasets demonstrate consistent improvements in computational efficiency with the dropin approach, and maximum relative reductions in Equal Error Rate of around 39% and 66% with the dropin and plasticity approaches, respectively, across these datasets. The code and supplementary material are available at Github link.
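The core idea the abstract describes—growing a layer by adding neurons so that model capacity changes without full retraining—can be sketched roughly as follows. This is an illustrative NumPy sketch under stated assumptions, not the authors' released code: the function name `dropin`, the small-random initialisation of the new neurons, and the zero-initialisation of their outgoing weights are all assumptions chosen so that the network's function is initially unchanged.

```python
import numpy as np

def dropin(W1, b1, W2, k, scale=0.01, rng=None):
    """Grow a hidden layer by k neurons without disturbing existing weights.

    W1: (hidden, in) weights of the layer being grown
    b1: (hidden,) its biases
    W2: (out, hidden) weights of the following layer

    New rows of W1 get small random values; the matching new columns of
    W2 start at zero, so the grown network computes the same function as
    before and only the new parameters need training.
    """
    rng = rng or np.random.default_rng(0)
    new_rows = scale * rng.standard_normal((k, W1.shape[1]))
    W1_grown = np.vstack([W1, new_rows])
    b1_grown = np.concatenate([b1, np.zeros(k)])
    W2_grown = np.hstack([W2, np.zeros((W2.shape[0], k))])
    return W1_grown, b1_grown, W2_grown

# Toy forward pass: output is identical before and after dropin,
# because the new neurons feed only zero-weight columns of W2.
x = np.ones(4)
W1, b1, W2 = np.ones((3, 4)), np.zeros(3), np.ones((2, 3))
y_before = W2 @ np.tanh(W1 @ x + b1)
W1g, b1g, W2g = dropin(W1, b1, W2, k=2)
y_after = W2g @ np.tanh(W1g @ x + b1g)
```

The zero-initialised outgoing weights are the key design choice: they make growth function-preserving, which is what allows the grown model to start from the pretrained model's behaviour rather than being retrained from scratch.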
Problem

Research questions and friction points this paper is trying to address.

deepfake audio detection
model scaling
computational efficiency
parameter adaptation
neuroplasticity
Innovation

Methods, ideas, or system contributions that make the work stand out.

neuron-level dropin
neuroplasticity
deepfake audio detection
parameter-efficient adaptation
dynamic neuron adjustment
Yupei Li
Department of Computing and Chair of Health Informatics, Imperial College London and Technical University of Munich
Shuaijie Shao
Department of Computer Science, University College London
Manuel Milling
Chair of Health Informatics, Technical University of Munich
Björn Schuller
Professor, Technische Universität München (TUM) / Imperial College London & CSO, audEERING
Health Informatics, Digital Health, AI, Affective Computing, Computer Audition