🤖 AI Summary
Existing model merging methods suffer from neuron-level task interference during multi-task fusion, causing performance degradation and poor interpretability. To address this, we propose NeuroMerging, the first training-free merging framework grounded in neuron subspace decomposition. It decouples neuron sensitivity and input adaptivity into orthogonal subspaces, jointly modeling task-specific sensitivity and input-dependent adaptivity, and aligns task arithmetic with these neuronal mechanisms to suppress cross-task interference, enabling training-free fusion across tasks and modalities. Evaluated on multi-task vision and language benchmarks, NeuroMerging achieves an average performance gain of 3.2% while improving model interpretability and generalization stability.
📝 Abstract
Fine-tuning pre-trained models on targeted datasets enhances task-specific performance but often comes at the expense of generalization. Model merging techniques, which integrate multiple fine-tuned models into a single multi-task model through task arithmetic at various levels (model, layer, or parameter), offer a promising solution. However, task interference remains a fundamental challenge, leading to performance degradation and suboptimal merged models. Existing approaches largely overlook the fundamental role of individual neurons and their connectivity, resulting in a lack of interpretability in both the merging process and the merged models. In this work, we present the first study on the impact of neuronal alignment in model merging. We decompose task-specific representations into two complementary neuronal subspaces that regulate neuron sensitivity and input adaptability. Leveraging this decomposition, we introduce NeuroMerging, a novel merging framework developed to mitigate task interference within neuronal subspaces, enabling training-free model fusion across diverse tasks. Through extensive experiments, we demonstrate that NeuroMerging achieves superior performance compared to existing methods on multi-task benchmarks across both vision and natural language domains. Our findings highlight the importance of aligning neuronal mechanisms in model merging, offering new insights into mitigating task interference and improving knowledge fusion.
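The abstract does not specify the exact decomposition, but the general recipe (task vectors plus a subspace split per weight matrix) can be sketched. Below is a minimal, illustrative NumPy sketch: `task_vector`, `split_neuron_subspaces`, and `merge` are hypothetical names, and the SVD-based split into a dominant component and its orthogonal residual is an assumption standing in for the paper's sensitivity/adaptivity subspaces, not the authors' actual method.

```python
import numpy as np

def task_vector(finetuned, pretrained):
    # Task vector: difference between fine-tuned and pre-trained weights,
    # as in standard task arithmetic.
    return finetuned - pretrained

def split_neuron_subspaces(tv, k=1):
    # Hypothetical stand-in for the paper's decomposition: split a task
    # vector into a dominant component spanned by its top-k singular
    # directions and the orthogonal residual. The two parts are orthogonal
    # under the Frobenius inner product.
    U, S, Vt = np.linalg.svd(tv, full_matrices=False)
    dominant = (U[:, :k] * S[:k]) @ Vt[:k, :]
    residual = tv - dominant
    return dominant, residual

def merge(pretrained, finetuned_models, lam=0.4, k=1):
    # Training-free merge: average the dominant subspace components of
    # each task vector and add them back to the pre-trained weights,
    # discarding the residuals (a crude interference-suppression proxy).
    acc = np.zeros_like(pretrained)
    for ft in finetuned_models:
        dom, _ = split_neuron_subspaces(task_vector(ft, pretrained), k)
        acc += dom
    return pretrained + lam * acc / len(finetuned_models)
```

With `k` equal to the full rank, the dominant component recovers the whole task vector, so merging a single model with `lam=1.0` reproduces that fine-tuned model exactly; smaller `k` keeps only each task's strongest directions.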