NeuroAda: Activating Each Neuron's Potential for Parameter-Efficient Fine-Tuning

📅 2025-10-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing parameter-efficient fine-tuning (PEFT) methods face a fundamental trade-off between fine-grained adaptability and memory efficiency: additive approaches (e.g., LoRA) are memory-efficient but suffer from limited expressivity, while selective in-place tuning achieves high accuracy at the cost of substantial GPU memory overhead. To address this, the authors propose NeuroAda, a PEFT framework that combines connection importance estimation, selective adaptation, and trainable bypass modules. NeuroAda freezes the backbone parameters and updates only lightweight, task-specific bypass connections attached to the most critical weights. With ≤0.02% trainable parameters, it achieves state-of-the-art performance across 23+ NLP benchmarks while reducing CUDA memory consumption by up to 60%, easing the long-standing efficiency-capacity trade-off in PEFT and enabling expressive yet memory-efficient adaptation.


πŸ“ Abstract
Existing parameter-efficient fine-tuning (PEFT) methods primarily fall into two categories: addition-based and selective in-situ adaptation. The former, such as LoRA, introduce additional modules to adapt the model to downstream tasks, offering strong memory efficiency. However, their representational capacity is often limited, making them less suitable for fine-grained adaptation. In contrast, the latter directly fine-tunes a carefully chosen subset of the original model parameters, allowing for more precise and effective adaptation, but at the cost of significantly increased memory consumption. To reconcile this trade-off, we propose NeuroAda, a novel PEFT method that enables fine-grained model finetuning while maintaining high memory efficiency. Our approach first identifies important parameters (i.e., connections within the network) as in selective adaptation, and then introduces bypass connections for these selected parameters. During finetuning, only the bypass connections are updated, leaving the original model parameters frozen. Empirical results on 23+ tasks spanning both natural language generation and understanding demonstrate that NeuroAda achieves state-of-the-art performance with as little as ≤0.02% trainable parameters, while reducing CUDA memory usage by up to 60%. We release our code here: https://github.com/FightingFighting/NeuroAda.git.
Problem

Research questions and friction points this paper is trying to address.

Reconciling memory efficiency with fine-grained adaptation in PEFT
Enabling precise parameter updates while maintaining low memory usage
Overcoming limitations of existing addition-based and selective adaptation methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

NeuroAda identifies the most important connections in the network for fine-tuning
Adds bypass connections only for the selected parameters
Updates only the bypass connections while keeping the original model parameters frozen
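The steps above can be sketched as a small NumPy layer. This is a hypothetical illustration, not the paper's implementation: the importance criterion here is simple weight magnitude (the paper's actual estimation method is not detailed on this page), and `NeuroAdaLinearSketch`, `topk_mask`, and `k` are names invented for this example.

```python
import numpy as np

def topk_mask(W, k):
    """Select the k largest-magnitude connections as 'important'.
    (Magnitude is a stand-in for the paper's importance estimation.)"""
    thresh = np.sort(np.abs(W).ravel())[-k]
    return (np.abs(W) >= thresh).astype(W.dtype)

class NeuroAdaLinearSketch:
    """Frozen linear layer plus a sparse, trainable bypass restricted
    to the selected connections."""
    def __init__(self, W, k):
        self.W = W                     # frozen backbone weights
        self.mask = topk_mask(W, k)    # which connections get a bypass
        self.delta = np.zeros_like(W)  # trainable bypass, zero-initialized

    def forward(self, x):
        # At initialization the output equals the frozen model's output;
        # training would update only the masked entries of delta.
        return (self.W + self.mask * self.delta) @ x
```

Zero-initializing the bypass means the adapted model starts exactly at the pretrained model, and the fraction of trainable parameters is `k / W.size`, which is how a budget like ≤0.02% would be enforced.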