🤖 AI Summary
Existing model merging methods often discard task-specific information in multi-task scenarios, leading to substantial performance degradation compared with individually fine-tuned models, even on semantically similar tasks. To address this, the paper proposes Decomposition, Thresholding, and Scaling (DTS), an approximation-based personalized merging framework that preserves task-specific information with minimal storage overhead. DTS applies Singular Value Decomposition (SVD) to the task-specific information and retains only a small subset of singular values and vectors; a novel thresholding strategy then partitions the singular-vector elements into groups and assigns each group its own scaling factor. To generalize to unseen tasks, a DTS variant further fuses task-specific information in a data-free manner based on the semantic similarity of task characteristics. Crucially, DTS incurs only about 1% additional storage per task, keeping the merged model lightweight. Extensive experiments demonstrate that DTS consistently outperforms state-of-the-art model merging approaches across multiple benchmarks, and the variant achieves significantly better generalization on unseen tasks, supporting its robustness and scalability in practical multi-task settings.
📝 Abstract
Model merging has emerged as a promising paradigm for enabling multi-task capabilities without additional training. However, existing methods often experience substantial performance degradation compared with individually fine-tuned models, even on similar tasks, underscoring the need to preserve task-specific information. This paper proposes Decomposition, Thresholding, and Scaling (DTS), an approximation-based personalized merging framework that preserves task-specific information with minimal storage overhead. DTS first applies singular value decomposition to the task-specific information and retains only a small subset of singular values and vectors. It then introduces a novel thresholding strategy that partitions singular vector elements into groups and assigns a scaling factor to each group. To enable generalization to unseen tasks, we further extend DTS with a variant that fuses task-specific information in a data-free manner based on the semantic similarity of task characteristics. Extensive experiments demonstrate that DTS consistently outperforms state-of-the-art baselines while requiring only 1% additional storage per task. Furthermore, experiments on unseen tasks show that the DTS variant achieves significantly better generalization performance. Our code is available at https://github.com/krumpguo/DTS.
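To make the two-stage idea in the abstract concrete, here is a minimal NumPy sketch of how an SVD-based low-rank compression of a task vector followed by group-wise thresholding and scaling could look. The grouping rule (a magnitude quantile splitting elements into two groups), the scaling factors, and the function name `dts_compress` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def dts_compress(delta, k=8, threshold=0.5, scales=(1.0, 0.5)):
    """Sketch of DTS-style compression of a task vector (details assumed).

    delta:     task-specific weight difference (W_finetuned - W_base).
    k:         number of singular values/vectors retained (low-rank budget).
    threshold: magnitude quantile partitioning singular-vector elements
               into two groups (hypothetical grouping rule).
    scales:    one scaling factor per group.
    """
    # Stage 1: low-rank approximation, keeping only the top-k singular
    # values and vectors -- the source of the ~1% storage overhead.
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)
    U, s, Vt = U[:, :k], s[:k], Vt[:k, :]

    # Stage 2: partition singular-vector elements into groups by magnitude
    # and rescale each group with its assigned factor (assumed scheme).
    def group_scale(M):
        out = M.copy()
        cut = np.quantile(np.abs(M), threshold)
        small = np.abs(M) < cut
        out[small] *= scales[1]    # down-weight the small-magnitude group
        out[~small] *= scales[0]   # keep the large-magnitude group as-is
        return out

    # Reconstructed task-specific update from the compressed factors.
    return group_scale(U) @ np.diag(s) @ group_scale(Vt)

rng = np.random.default_rng(0)
delta = rng.standard_normal((64, 64))
approx = dts_compress(delta, k=8)
print(approx.shape)  # (64, 64)
```

Storing only `U[:, :k]`, `s[:k]`, and `Vt[:k, :]` costs `k * (m + n + 1)` floats instead of `m * n`, which is how a small rank budget keeps the per-task overhead near 1% of the full weight delta.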