From Coefficients to Directions: Rethinking Model Merging with Directional Alignment

📅 2025-11-29
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing model fusion methods neglect the directional structure inherent in both parameter and feature spaces, leading to directional inconsistency and compromised structural coherence. Coefficient-based optimization implicitly assumes directional compatibility, yet independently trained models often exhibit markedly divergent directional patterns. This work introduces the first geometric framework for directional alignment in model fusion, systematically characterizing the directional mismatch problem and establishing a paradigm that jointly enforces structural consistency in both parameter and feature spaces. Methodologically, it integrates parameter decomposition, subspace alignment, and feature-direction calibration, leveraging the neural collapse phenomenon to achieve cross-model directional consistency. Extensive experiments across multiple model scales and tasks demonstrate significant improvements over state-of-the-art fusion methods, empirically validating the critical role of directional alignment in enhancing fusion performance.
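
To make the directional mismatch described above concrete, the sketch below measures how far two task vectors (fine-tuned weights minus the base weights) point in different directions and applies a generic orthogonal Procrustes rotation to bring one layer's update onto the other's dominant directions before averaging. This is a minimal illustration, not the paper's algorithm; `directional_mismatch` and `procrustes_align` are hypothetical helper names.

```python
import numpy as np

def directional_mismatch(delta_a: np.ndarray, delta_b: np.ndarray) -> float:
    """Cosine similarity between two flattened task vectors (fine-tuned minus base)."""
    a, b = delta_a.ravel(), delta_b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def procrustes_align(delta_ref: np.ndarray, delta_other: np.ndarray) -> np.ndarray:
    """Rotate delta_other onto delta_ref with the orthogonal Procrustes solution,
    so the two updates' dominant parameter directions agree before averaging."""
    u, _, vt = np.linalg.svd(delta_other.T @ delta_ref)
    return delta_other @ (u @ vt)

# Toy example: one linear layer fine-tuned twice from the same base weights.
rng = np.random.default_rng(0)
w_base = rng.normal(size=(64, 64))
delta_a = 0.05 * rng.normal(size=(64, 64))
delta_b = 0.05 * rng.normal(size=(64, 64))

print(directional_mismatch(delta_a, delta_b))          # near 0: directions disagree
delta_b_aligned = procrustes_align(delta_a, delta_b)
print(directional_mismatch(delta_a, delta_b_aligned))  # higher after alignment
w_merged = w_base + 0.5 * (delta_a + delta_b_aligned)  # align directions, then average
```

Scalar merging coefficients only rescale such updates; they cannot change where the updates point, which is the gap the directional-alignment view targets.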

📝 Abstract
Model merging has emerged as a practical paradigm for integrating multiple independently trained models into a single model without joint retraining. Previous studies have demonstrated the effectiveness of combining parameters through strategies such as parameter decomposition, coefficient optimization, and subspace learning, significantly reducing the need for expensive joint training and achieving strong empirical performance across diverse tasks. However, these approaches predominantly treat merging as a problem of parameter-space decomposition or fusion-coefficient optimization, while overlooking the critical role of directional information in both parameter and feature spaces. In practice, naïve merging introduces inconsistencies in dominant parameter directions and disrupts structural coherence across models, which can degrade performance. Moreover, coefficient-based optimization methods implicitly assume compatible feature-space directions across models. Yet Neural Collapse indicates that class features follow structured directional patterns, which may differ across independently trained models, making coefficient optimization alone insufficient. In this work, we emphasize the importance of *directional alignment* and introduce a unified geometric framework, *Merging with Directional Alignment*, which aligns directional structures consistently in both the parameter and feature spaces. Our analysis shows that directional alignment improves structural coherence, and extensive experiments across benchmarks, model scales, and task configurations further validate the effectiveness of our approach.
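
For context on the Neural Collapse argument in the abstract, the standard characterization (stated here as background, not as a result of this paper) is that the globally centered, unit-normalized class-mean features $\tilde{\mu}_1, \dots, \tilde{\mu}_K$ of a well-trained classifier form a simplex equiangular tight frame:

$$
\langle \tilde{\mu}_i, \tilde{\mu}_j \rangle \;=\; \frac{K}{K-1}\,\delta_{ij} \;-\; \frac{1}{K-1},
$$

so every pair of distinct class directions meets at the same angle. The frame is only determined up to a rotation of feature space, which is why two independently trained models can each satisfy Neural Collapse while their class directions still disagree, exactly the cross-model mismatch that coefficient optimization alone cannot repair.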
Problem

Research questions and friction points this paper is trying to address.

Model merging lacks directional alignment in parameter and feature spaces.
Naive merging disrupts structural coherence and degrades model performance.
Coefficient optimization alone is insufficient because feature directions differ across independently trained models (see the sketch below).
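
A minimal, self-contained illustration of this point (toy data; helper names such as `class_mean_directions` are made up for this sketch, not taken from the paper): two models can represent the same classes along different feature directions, and no scalar merging coefficient can rotate one set of directions onto the other.

```python
import numpy as np

def class_mean_directions(features: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Unit-norm, globally centered per-class mean features (one row per class)."""
    mu = np.stack([features[labels == c].mean(axis=0) for c in np.unique(labels)])
    mu = mu - features.mean(axis=0)                 # remove the global mean
    return mu / np.linalg.norm(mu, axis=1, keepdims=True)

def cross_model_alignment(dirs_a: np.ndarray, dirs_b: np.ndarray) -> np.ndarray:
    """Per-class cosine similarity between two models' class-mean directions."""
    return np.sum(dirs_a * dirs_b, axis=1)

# Toy example: identical features for 3 classes, but model B's space is rotated.
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(300, 16))
labels = np.repeat(np.arange(3), 100)
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
feats_b = feats_a @ rotation                        # same content, rotated directions

cosines = cross_model_alignment(
    class_mean_directions(feats_a, labels),
    class_mean_directions(feats_b, labels),
)
print(cosines)  # well below 1.0: a scalar coefficient cannot undo a rotation
```

Feature-direction calibration, as described in the summary above, is the kind of step needed to reconcile such rotated class geometries before (or alongside) choosing merge coefficients.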
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aligns parameter- and feature-space directions
Introduces a unified geometric framework for model merging
Improves structural coherence across merged models