Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Images

📅 2025-02-23
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
To address unauthorized fine-tuning of Large Vision-Language Models (LVLMs), which can strip original developers of copyright control, this paper proposes a black-box copyright-tracing method that requires no modification to the original model. The core innovation is the Parameter Learning Attack (PLA) mechanism: leveraging gradient-guided reverse optimization, it generates adversarial images that reliably trigger pre-specified watermark outputs across both the original LVLM and its diverse fine-tuned variants, including LoRA, Adapter, and full-parameter adaptations. The method is robust to fine-tuning, supports post-deployment watermarking, and introduces no performance degradation. Extensive evaluation across multiple architectures, fine-tuning strategies, and cross-dataset settings demonstrates an average 23.6% improvement in copyright-identification accuracy over state-of-the-art baselines. To the authors' knowledge, this is the first approach to achieve non-intrusive, post-deployment copyright watermarking for LVLMs, bridging a critical gap in LVLM intellectual-property protection.
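
The summary describes PLA only at a high level. As a concrete illustration, here is a minimal PyTorch-style sketch of the alternating updates it implies: a gradient step on the image toward the watermark output, paired with an opposite-direction (ascent) step on a throwaway copy of the parameters to emulate fine-tuning drift. The interface (`proxy(image)` returning flat token logits), the learning rates, and the step count are assumptions for illustration, not the authors' released code.

```python
import copy
import torch
import torch.nn.functional as F

def parameter_learning_attack(model, init_image, watermark_ids,
                              steps=200, img_lr=1e-2, param_lr=1e-5):
    """Illustrative PLA loop (hypothetical interface, not the paper's code).

    Each iteration performs two opposing gradient updates:
      1) descend on the IMAGE so the model emits the pre-specified
         watermark token sequence `watermark_ids`;
      2) ascend on a throwaway COPY of the parameters, i.e. update them
         in the opposite direction of ordinary training, emulating the
         drift a downstream fine-tune would cause, so the trigger image
         stays effective on fine-tuned variants.
    The released model itself is never touched.
    """
    proxy = copy.deepcopy(model)                       # original stays frozen
    trigger = init_image.clone().requires_grad_(True)

    for _ in range(steps):
        proxy.zero_grad()
        trigger.grad = None

        # Assumed interface: proxy(image) -> (seq_len, vocab_size) logits
        logits = proxy(trigger)
        loss = F.cross_entropy(logits, watermark_ids)
        loss.backward()  # one backward fills image and parameter grads

        with torch.no_grad():
            # (1) image step: signed gradient descent toward the watermark
            trigger -= img_lr * trigger.grad.sign()
            trigger.clamp_(0.0, 1.0)
            # (2) parameter step: gradient ASCENT on the throwaway copy
            for p in proxy.parameters():
                if p.grad is not None:
                    p += param_lr * p.grad

    return trigger.detach()
```

The signed image update mirrors common PGD-style attacks; whether the paper uses signed or raw gradients, or a different alternation schedule, is not specified in this summary.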

📝 Abstract
Large vision-language models (LVLMs) have demonstrated remarkable image understanding and dialogue capabilities, allowing them to handle a variety of visual question answering tasks. However, their widespread availability raises concerns about unauthorized usage and copyright infringement, as users can develop their own LVLMs by fine-tuning published models. In this paper, we propose a novel method called Parameter Learning Attack (PLA) for tracking the copyright of LVLMs without modifying the original model. Specifically, we construct adversarial images through targeted attacks against the original model, enabling it to generate specific outputs. To ensure these attacks remain effective at triggering copyright tracking on potential fine-tuned models, we let the original model learn the trigger images by updating its parameters in the opposite direction during the adversarial attack process. Notably, the proposed method can be applied after the release of the original model, and thus does not affect the model's performance or behavior. To simulate real-world applications, we fine-tune the original model using various strategies across diverse datasets, creating a range of models for copyright verification. Extensive experiments demonstrate that our method can identify the original copyright of fine-tuned models more effectively than baseline methods. Therefore, this work provides a powerful tool for tracking copyrights and detecting unlicensed usage of LVLMs.
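
The verification side described in the abstract (query a suspect model with the trigger images and check whether the watermark appears) could look like the following sketch. The `generate` call, the substring match, and the decision threshold are illustrative assumptions rather than the paper's exact protocol.

```python
import torch

@torch.no_grad()
def verify_copyright(suspect_model, tokenizer, triggers, watermark_text,
                     threshold=0.5):
    """Black-box check: does a suspect model reproduce the watermark on
    enough trigger images to be flagged as a derivative? The `generate`
    API and the 0.5 threshold are assumptions, not the paper's protocol."""
    hits = 0
    for img in triggers:
        output_ids = suspect_model.generate(img)        # assumed API
        text = tokenizer.decode(output_ids, skip_special_tokens=True)
        hits += int(watermark_text in text)
    rate = hits / len(triggers)
    return rate >= threshold, rate
```

To mimic the paper's evaluation setting, one would fine-tune the released model with different strategies (e.g., LoRA or full-parameter updates) on unrelated datasets and run this check on each resulting variant.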
Problem

Research questions and friction points this paper is trying to address.

Track copyright of large vision-language models
Detect unauthorized fine-tuning of published models
Identify original authorship in adapted models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter Learning Attack (PLA)
Adversarial images for copyright tracking
Copyright verification without modifying the original model
👥 Authors

Yubo Wang
University of Science and Technology of China, State Key Laboratory of Cognitive Intelligence

Jianting Tang
University of Science and Technology of China

Chaohu Liu
University of Science and Technology of China, State Key Laboratory of Cognitive Intelligence

Linli Xu
University of Science and Technology of China, State Key Laboratory of Cognitive Intelligence