🤖 AI Summary
To counter the intellectual-property infringement and loss of accountability caused by unauthorized model merging, where "free-riders" illicitly combine fine-tuned models into a single multi-capability model, this paper proposes MergeGuard, a two-stage weight protection framework. MergeGuard pairs gradient redistribution with subspace perturbation: it uses L2-regularized optimization to redistribute task-relevant information evenly across layers, then injects structured perturbations that misalign task subspaces and disrupt curvature compatibility in the loss landscape. Crucially, the protected model retains its original performance (accuracy degradation below 1.5%) while merge compatibility is destroyed. Extensive experiments on mainstream architectures, including ViT, Llama2, Gemma2, and Mistral, show that unauthorized merged models suffer up to a 90% accuracy drop, demonstrating MergeGuard's robustness and practical efficacy in safeguarding model weights against unauthorized integration.
📝 Abstract
The rapid proliferation of pretrained models and open repositories has made model merging a convenient yet risky practice, allowing free-riders to combine fine-tuned models into a new multi-capability model without authorization. Such unauthorized model merging not only violates intellectual property rights but also undermines model ownership and accountability. To address this issue, we present MergeGuard, a proactive dual-stage weight protection framework that disrupts merging compatibility while maintaining task fidelity. In the first stage, we redistribute task-relevant information across layers via L2-regularized optimization, ensuring that important gradients are evenly dispersed. In the second stage, we inject structured perturbations to misalign task subspaces, breaking curvature compatibility in the loss landscape. Together, these stages reshape the model's parameter geometry such that merged models collapse under destructive interference while the protected model remains fully functional. Extensive experiments on both vision (ViT-L-14) and language (Llama2, Gemma2, Mistral) models demonstrate that MergeGuard reduces merged model accuracy by up to 90% with less than 1.5% performance loss on the protected model.
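The core idea, reshaping parameter geometry so the protected model still works but weight-level merging collapses, can be illustrated with a toy analogue. The sketch below is **not** MergeGuard's actual algorithm (the paper's L2-regularized redistribution and curvature-breaking perturbations are more involved); it only shows, on a tiny two-layer linear model, how a function-preserving structured reparameterization (inserting an invertible matrix `R` and its inverse between layers, a hypothetical stand-in for the paper's structured perturbation) leaves the protected model's outputs intact while making naive weight averaging destructive:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Tiny two-layer linear "model": f(x) = W2 @ (W1 @ x)
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, d))
x = rng.normal(size=d)
y = W2 @ (W1 @ x)

# Function-preserving structured perturbation: insert an invertible R
# between the layers and cancel it with R^{-1}.
R = rng.normal(size=(d, d)) + d * np.eye(d)  # well-conditioned, invertible
W1_protected = R @ W1
W2_protected = W2 @ np.linalg.inv(R)

# The protected model computes exactly the same function ...
y_protected = W2_protected @ (W1_protected @ x)
print(np.allclose(y, y_protected))  # True

# ... but averaging its weights with an unprotected copy of the SAME
# model (the simplest merge) no longer reproduces that function:
# 0.25 * W2 (R + 2I + R^{-1}) W1 != W2 W1 unless R = I.
M1 = 0.5 * (W1 + W1_protected)
M2 = 0.5 * (W2 + W2_protected)
y_merged = M2 @ (M1 @ x)
print(np.allclose(y, y_merged))  # False: the merge is broken
```

The point of the toy is that merging operates on raw weights, not functions: two weight settings can be functionally identical yet geometrically incompatible, which is exactly the property MergeGuard exploits at scale.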