Model Merging on Loss Landscape: A Geometry Perspective

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing model merging approaches either neglect the geometric structure of the loss landscape or rely on intractable full-space Hessian approximations, which limits how well knowledge from separately fine-tuned models can be integrated. The authors formulate model merging as computing the Fréchet mean on a Riemannian manifold, restricted to the low-rank subspace spanned by the task vectors, with the expected Hessian as the metric. This formulation reveals a link between local curvature and the epistemic uncertainty of the parameters. The analysis decomposes the merging error bound into the subspace Fréchet variance and the residual energy, and characterizes in closed form when curvature-aware merging provably outperforms flat-geometry methods. Curvature-aware and spectral methods are shown to be special cases of the same framework under different geometric metrics. In experiments merging fine-tuned CLIP-ViT models on eight image classification tasks, the proposed method outperforms the baselines at matched rank in both across-task average accuracy and worst-task accuracy on all three CLIP-ViT backbones.

📝 Abstract

Model merging offers a promising avenue for knowledge integration and parallel development without retraining. Yet, existing methods either ignore the geometry of the loss landscape or rely on intractable full-space Hessian approximations. We propose Epistemic Merging (EpiMer), a framework that casts model merging as computing the Fréchet mean on a Riemannian manifold and restricts the computation to a low-rank subspace spanned by the task vectors. With the expected Hessian as the metric, we reveal a connection between local curvature and epistemic uncertainty of the parameters. Our theoretical analysis decomposes the merging error bound into the subspace Fréchet variance and the residual energy, and provides a closed-form characterization of when curvature-aware merging provably outperforms flat-geometry methods. In addition, our framework unifies both curvature-aware methods and recent spectral methods as special cases of the subspace Fréchet mean with different geometric metrics. Merging fine-tuned CLIP-ViT models on eight image classification tasks, EpiMer strictly outperforms the baselines on all three CLIP-ViT backbones at matched rank, improving the across-task average accuracy and worst-task accuracy on every backbone.
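
To make the idea of a Fréchet mean in the task-vector subspace concrete, here is a minimal NumPy sketch. It is not the authors' algorithm. It assumes each model's metric is constant near its fine-tuned weights, so the squared geodesic distance reduces to a quadratic form and the Fréchet mean has a closed-form solution, and it assumes small dense Hessian stand-ins that are positive definite. The function name `subspace_frechet_merge` and all argument names are hypothetical, and any component dropped by rank truncation is simply discarded.

```python
import numpy as np


def subspace_frechet_merge(theta0, thetas, hessians, rank=None):
    """Illustrative merge in the task-vector subspace (constant-metric assumption).

    theta0   : (d,) base weights shared by all fine-tuned models
    thetas   : list of k arrays of shape (d,), the fine-tuned weights
    hessians : list of k arrays of shape (d, d), positive-definite curvature
               stand-ins for each task (hypothetical inputs for this sketch)
    rank     : optional truncation rank for the subspace basis
    """
    # Task vectors: displacement of each fine-tuned model from the base.
    T = np.stack([t - theta0 for t in thetas], axis=1)   # (d, k)

    # Orthonormal basis of the subspace spanned by the task vectors.
    U, s, _ = np.linalg.svd(T, full_matrices=False)
    if rank is None:
        rank = int(np.sum(s > 1e-10 * s.max()))
    U = U[:, :rank]                                      # (d, r)

    # Subspace coordinates of each task vector and projected metrics.
    coords = U.T @ T                                     # (r, k)
    metrics = [U.T @ H @ U for H in hessians]            # each (r, r)

    # Constant-metric Frechet mean: minimise
    #   sum_i (c - c_i)^T G_i (c - c_i)
    # whose stationary point solves (sum_i G_i) c = sum_i G_i c_i.
    lhs = sum(metrics)
    rhs = sum(G @ coords[:, i] for i, G in enumerate(metrics))
    c = np.linalg.solve(lhs, rhs)

    # Map the merged coordinates back to the full parameter space.
    return theta0 + U @ c


# Toy usage with random weights and identity metrics (flat geometry).
rng = np.random.default_rng(0)
d, k = 6, 3
base = rng.normal(size=d)
models = [base + rng.normal(size=d) for _ in range(k)]
flat = subspace_frechet_merge(base, models, [np.eye(d)] * k)
print(np.allclose(flat, np.mean(models, axis=0)))
```

With identity metrics the sketch reduces to plain weight averaging, which is the flat-geometry special case. Giving one task a larger curvature pulls the merged weights toward that task's solution, which is the behaviour a curvature-aware metric is meant to capture.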

Problem

Research questions and friction points this paper is trying to address.

model merging

loss landscape

geometry

Hessian approximation

knowledge integration

Innovation

Methods, ideas, or system contributions that make the work stand out.

model merging

loss landscape geometry

Fréchet mean