AI Summary
Existing black-box AI model watermarking schemes are vulnerable to adversarial evidence forgery attacks, undermining copyright protection. Method: This paper proposes a self-authenticating black-box watermarking protocol featuring (i) a novel hash-driven self-authentication mechanism that explicitly models adversarial perturbations for enhanced robustness; (ii) a purification-agnostic curriculum proxy learning framework that keeps the watermark verifiable whether or not a purification step is applied; and (iii) lightweight, efficient embedding via proxy network distillation. Contributions/Results: We identify a new paradigm of evidence forgery attacks; achieve >92% watermark survival rate under multiple adversarial attacks; incur <0.8% accuracy degradation on downstream tasks; and empirically validate the scheme's reliability, auditability, and legal admissibility in copyright attribution.
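To make the self-authentication idea concrete, here is a minimal sketch of how a hash-driven protocol could bind trigger labels to the owner's secret key. The function names, the SHA-256 choice, and the model-as-callable interface are illustrative assumptions, not the paper's actual construction.

```python
import hashlib

import numpy as np


def derive_trigger_labels(owner_key: bytes, trigger_inputs, num_classes: int):
    """Derive each trigger's target label from a hash of the owner's key and
    the trigger input itself. The (input, label) evidence is then
    self-authenticating: a verifier can recompute the labels, and forged
    evidence fails the check."""
    labels = []
    for x in trigger_inputs:
        digest = hashlib.sha256(owner_key + np.asarray(x).tobytes()).digest()
        labels.append(digest[0] % num_classes)  # deterministic, key-dependent label
    return labels


def verify_ownership(model, owner_key: bytes, trigger_inputs, num_classes: int,
                     threshold: float = 0.9) -> bool:
    """Black-box verification: query the suspect model on the trigger inputs
    and measure agreement with the hash-derived labels."""
    expected = derive_trigger_labels(owner_key, trigger_inputs, num_classes)
    predicted = [int(model(x)) for x in trigger_inputs]  # model returns a class index
    agreement = sum(p == e for p, e in zip(predicted, expected)) / len(expected)
    return agreement >= threshold
```

Because the labels are a deterministic function of the key and the triggers, a verifier needs only the disclosed key and trigger set to audit an ownership claim, which is what makes the evidence self-authenticating rather than trust-based.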
Abstract
With the proliferation of AI agents across domains, protecting the ownership of AI models has become crucial given the significant investment their development requires. Unauthorized use and illegal distribution of these models pose serious threats to intellectual property, necessitating effective copyright protection measures. Model watermarking has emerged as a key technique to address this issue, embedding ownership information within models so that rightful ownership can be asserted in copyright disputes. This paper makes several contributions to model watermarking: a self-authenticating black-box watermarking protocol based on hash techniques; a study of evidence forgery attacks that exploit adversarial perturbations; a proposed defense that adds a purification step to counter such attacks; and a purification-agnostic curriculum proxy learning method that enhances watermark robustness and model performance. Experimental results demonstrate the effectiveness of these approaches in improving the security, reliability, and performance of watermarked models.
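As a rough illustration of the threat and the defense described above, the sketch below forges watermark "evidence" with a PGD-style adversarial perturbation against an innocent PyTorch classifier, and applies a toy purification step (bit-depth squeezing) that a verifier could run before querying. All names, the attack variant, and the hyperparameters are illustrative assumptions, not the paper's method.

```python
import torch
import torch.nn.functional as F


def forge_evidence(model, x, target_label: int, eps: float = 0.03, steps: int = 10):
    """Evidence forgery sketch: perturb an ordinary input x (a [1, C, H, W]
    image tensor in [0, 1]) until an *innocent* model predicts the attacker's
    chosen 'watermark' label, yielding fake ownership evidence
    (an iterated FGSM / PGD-style targeted attack)."""
    x_adv = x.clone().detach().requires_grad_(True)
    target = torch.tensor([target_label])
    step_size = eps / steps
    for _ in range(steps):
        loss = F.cross_entropy(model(x_adv), target)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv -= step_size * grad.sign()  # descend toward the target label
            x_adv.clamp_(0.0, 1.0)            # keep the input a valid image
    return x_adv.detach()


def purify(x: torch.Tensor, bit_depth: int = 5) -> torch.Tensor:
    """Toy purification: quantizing to a few bits per channel squeezes out
    small perturbations before the verifier queries the suspect model."""
    levels = 2 ** bit_depth - 1
    return torch.round(x.clamp(0.0, 1.0) * levels) / levels
```

In this toy setting, `purify(forge_evidence(model, x, target_label))` will often restore the innocent model's original prediction, which is the intuition behind inserting a purification step into verification; the curriculum proxy learning method in the paper is then what keeps a genuine watermark verifiable under such purification.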