Making Models Unmergeable via Scaling-Sensitive Loss Landscape

📅 2026-01-29
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the governance risks posed by unauthorized model merging, which can circumvent safety alignment or licensing restrictions. To counter this, the authors propose Trap², a framework that reshapes the model’s loss landscape during fine-tuning through a scaling-sensitive loss function. This design ensures high performance under legitimate usage while significantly degrading model utility when subjected to illicit merging. Trap² represents the first architecture-agnostic defense mechanism against model merging, embedding protection directly into the training process and supporting both full-model and adapter-based deployment paradigms. Experimental results demonstrate that Trap² effectively suppresses unauthorized merging attempts without compromising model efficacy in compliant scenarios.

📝 Abstract
The rise of model hubs has made reusable model components easier to access, turning model merging into a practical tool for combining capabilities. Yet this modularity also creates a governance gap: downstream users can recompose released weights into unauthorized mixtures that bypass safety alignment or licensing terms. Because existing defenses are largely post-hoc and architecture-specific, they provide inconsistent protection across diverse architectures and release formats in practice. To close this gap, we propose Trap², an architecture-agnostic protection framework that encodes protection into the weight updates during fine-tuning, regardless of whether the weights are released as adapters or as full models. Instead of relying on architecture-dependent approaches, Trap² uses weight re-scaling as a simple proxy for the merging process. It keeps released weights effective in standalone use but degrades them under the re-scaling that typically arises in merging, thereby undermining unauthorized recomposition.
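For intuition, below is a minimal sketch of what a scaling-sensitive fine-tuning objective along these lines could look like in PyTorch. The function name `scaling_sensitive_loss`, the hinge-style trap term, and the `rescale_factor` / `degrade_margin` parameters are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of a scaling-sensitive objective: keep the released
# weights useful as-is, but make performance collapse when the weights are
# re-scaled, as a simple proxy for how merging re-weights contributions.
import torch.nn.functional as F
from torch.func import functional_call


def scaling_sensitive_loss(model, inputs, labels,
                           rescale_factor=0.5, degrade_margin=2.0):
    # Standard objective: the released weights must stay effective in
    # standalone use.
    task_loss = F.cross_entropy(model(inputs), labels)

    # Evaluate the same batch with every parameter re-scaled, keeping the
    # autograd graph intact so the trap term trains the original weights.
    scaled_params = {name: rescale_factor * p
                     for name, p in model.named_parameters()}
    scaled_logits = functional_call(model, scaled_params, (inputs,))
    scaled_loss = F.cross_entropy(scaled_logits, labels)

    # Hinge-style trap term: push the loss under re-scaling to sit at least
    # `degrade_margin` above the standalone loss (detached so the penalty
    # does not inflate the standalone loss itself).
    trap_penalty = F.relu(task_loss.detach() + degrade_margin - scaled_loss)

    return task_loss + trap_penalty
```

In this sketch the trade-off between standalone utility and merge resistance is controlled entirely by `rescale_factor` and `degrade_margin`; an adapter-based variant would re-scale only the adapter parameters rather than the full weight set.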
Problem

Research questions and friction points this paper is trying to address.

model merging
governance gap
unauthorized recomposition
safety alignment
licensing bypass
Innovation

Methods, ideas, or system contributions that make the work stand out.

model merging
weight re-scaling
architecture-agnostic protection
fine-tuning
safety alignment
🔎 Similar Papers
No similar papers found.
Minwoo Jang
Graduate School of AI, POSTECH, Pohang, Republic of Korea
Hoyoung Kim
POSTECH
Machine Learning
Jabin Koo
Department of CSE, POSTECH, Pohang, Republic of Korea
Jungseul Ok
Associate Professor, CSE/AI, POSTECH
Reinforcement Learning, Machine Learning