AdaRank: Adaptive Rank Pruning for Enhanced Model Merging

📅 2025-03-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
In multi-task model merging, fixed low-rank truncation of task vectors often induces cross-task interference and degrades performance. To address this, AdaRank introduces test-time adaptive selection of singular directions: each task vector is factorized via SVD, and harmful singular components are pruned dynamically through entropy minimization, removing the need for a manually specified, uniform rank threshold. This is the first rank-pruning paradigm that preserves a different, near-optimal amount of information per task and per layer, mitigating detrimental overlap among task vectors. The method is architecture-agnostic and scales across backbone networks and task configurations. On standard multi-task merging benchmarks it achieves state-of-the-art results, narrowing the gap to fully fine-tuned models to roughly 1% while improving robustness and generalization across heterogeneous tasks.
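The merging step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name, the per-task binary masks (which AdaRank learns at test time, omitted here), and the scaling coefficient `alpha` are assumptions for the sketch.

```python
import numpy as np

def merge_with_rank_masks(base, finetuned_list, masks, alpha=1.0):
    """Merge fine-tuned weight matrices into a base weight matrix,
    keeping only the singular components of each task vector that
    its binary mask selects (1 = keep, 0 = prune)."""
    merged = base.copy()
    for W, mask in zip(finetuned_list, masks):
        tau = W - base                          # task vector
        U, S, Vt = np.linalg.svd(tau, full_matrices=False)
        S_masked = S * mask                     # zero out pruned components
        merged = merged + alpha * (U * S_masked) @ Vt
    return merged
```

With all-ones masks this reduces to plain task-vector addition; a learned mask instead drops the singular directions that interfere with other tasks.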

📝 Abstract
Model merging has emerged as a promising approach for unifying independently fine-tuned models into an integrated framework, significantly enhancing computational efficiency in multi-task learning. Recently, several SVD-based techniques have been introduced to exploit low-rank structures for enhanced merging, but their reliance on manually designed rank selection often leads to cross-task interference and suboptimal performance. In this paper, we propose AdaRank, a novel model merging framework that adaptively selects the most beneficial singular directions of task vectors to merge multiple models. We empirically show that the dominant singular components of task vectors can cause critical interference with other tasks, and that naive truncation across tasks and layers degrades performance. In contrast, AdaRank dynamically prunes the singular components that cause interference and offers an optimal amount of information to each task vector by learning to prune ranks during test time via entropy minimization. Our analysis demonstrates that such a method mitigates detrimental overlaps among tasks, while empirical results show that AdaRank consistently achieves state-of-the-art performance with various backbones and numbers of tasks, reducing the performance gap with fine-tuned models to nearly 1%.
Problem

Research questions and friction points this paper is trying to address.

Adaptive rank pruning for optimal model merging
Mitigating cross-task interference in SVD-based merging
Enhancing performance via dynamic singular component pruning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive rank pruning for model merging
Dynamic singular component pruning via entropy
Optimal information allocation per task vector
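The entropy-driven pruning listed above can be illustrated with a deliberately simplified greedy variant: drop a singular component whenever doing so lowers mean prediction entropy on unlabeled inputs. Note the assumptions: the paper learns rank masks via gradient-based test-time training, whereas this sketch uses a greedy search, a linear `logits = X @ W` model, and made-up function names, purely for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_entropy(logits):
    """Average Shannon entropy of the softmax predictions."""
    p = softmax(logits)
    return -(p * np.log(p + 1e-12)).sum(axis=-1).mean()

def greedy_entropy_prune(base, finetuned, X):
    """Return a binary mask over singular components of the task vector,
    greedily zeroing components whose removal reduces prediction entropy
    on the unlabeled batch X (logits modeled as X @ W)."""
    tau = finetuned - base
    U, S, Vt = np.linalg.svd(tau, full_matrices=False)
    mask = np.ones_like(S)

    def score(m):
        W = base + (U * (S * m)) @ Vt
        return mean_entropy(X @ W)

    best = score(mask)
    for i in np.argsort(S):            # try weakest components first
        trial = mask.copy()
        trial[i] = 0.0
        s = score(trial)
        if s < best:                   # keep the prune only if it helps
            mask, best = trial, s
    return mask
```

The loop only accepts prunes that reduce entropy, so the returned mask is never worse than keeping every component under this objective.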