A Game Between the Defender and the Attacker for Trigger-based Black-box Model Watermarking

📅 2025-01-02

🤖 AI Summary

Problem: In black-box settings, deep neural network (DNN) models are vulnerable to unauthorized copying, and existing watermarking techniques lack theoretical guarantees of robustness against adversarial attacks. Method: This paper introduces game theory into model watermarking for the first time, establishing a two-player game between a defender and an attacker. It formally models the players' strategy spaces, payoff functions, and black-box query constraints, proposes a trigger-based watermarking mechanism with provable security, and rigorously defines the conditions under which a Nash equilibrium exists. Contribution/Results: The approach yields the first black-box watermarking scheme with theoretically provable robustness, in contrast to heuristic or data-driven alternatives. By closing the gap in formal security analysis, it offers a new paradigm for designing and evaluating watermarking schemes, with rigorous guarantees against model extraction attacks under realistic black-box access constraints.
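To make the black-box setting concrete, the following is a minimal sketch of generic trigger-based ownership verification, not the paper's own scheme. It assumes the verifier can only query the suspect model for a predicted label, and it uses a one-sided binomial test as the decision rule. The names `match_count`, `p_value`, `verify`, and the threshold `alpha` are hypothetical and chosen for illustration.

```python
from math import comb


def match_count(model, triggers, labels):
    """Count trigger inputs for which the black-box model returns the secret label."""
    return sum(1 for x, y in zip(triggers, labels) if model(x) == y)


def p_value(matches, n, chance):
    """Binomial tail: probability of at least `matches` hits out of `n`
    if each query hits the secret label independently with probability `chance`."""
    return sum(
        comb(n, k) * chance**k * (1 - chance) ** (n - k)
        for k in range(matches, n + 1)
    )


def verify(model, triggers, labels, num_classes, alpha=1e-3):
    """Claim ownership when the match count is very unlikely under random guessing.

    Assumption: an unwatermarked model hits each secret label with
    probability 1 / num_classes.
    """
    n = len(triggers)
    m = match_count(model, triggers, labels)
    return p_value(m, n, 1.0 / num_classes) < alpha
```

Here `model` is any callable that maps an input to a label, which mirrors query-only access. In a game-theoretic reading, the size of the trigger set and the threshold are defender choices, while the attacker tries to push the match count back toward chance.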

📝 Abstract

Watermarking deep neural network (DNN) models has attracted considerable attention in recent years because of the growing demand to protect the intellectual property of DNN models. Many practical algorithms covertly embed a secret watermark into a given DNN model, through either parametric/structural modulation or backdooring, to guard against intellectual property infringement by an attacker while preserving the model's performance on the original task. Despite the performance of these approaches, the lack of basic research restricts algorithmic design to either trial-based methods or data-driven techniques. This motivates us to introduce a game between the model attacker and the model defender for trigger-based black-box model watermarking. For each of the two players, we construct the payoff function and determine the optimal response, which enriches the theoretical foundation of model watermarking and may inspire novel schemes in the future.
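The idea of a payoff function and an optimal response for each player can be illustrated with a small finite game. The sketch below is a toy under stated assumptions: the strategy sets and the payoff numbers are invented for illustration and are not taken from the paper, and the function names are hypothetical. Rows index defender strategies (for example, trigger-set size) and columns index attacker strategies (for example, attack strength).

```python
def defender_best_response(u_def, j):
    """Row index maximizing the defender's payoff against attacker strategy j."""
    return max(range(len(u_def)), key=lambda i: u_def[i][j])


def attacker_best_response(u_att, i):
    """Column index maximizing the attacker's payoff against defender strategy i."""
    return max(range(len(u_att[i])), key=lambda j: u_att[i][j])


def pure_equilibria(u_def, u_att):
    """All (i, j) pairs where neither player can gain by deviating alone."""
    rows, cols = len(u_def), len(u_def[0])
    found = []
    for i in range(rows):
        for j in range(cols):
            defender_ok = u_def[i][j] == max(u_def[k][j] for k in range(rows))
            attacker_ok = u_att[i][j] == max(u_att[i])
            if defender_ok and attacker_ok:
                found.append((i, j))
    return found


# Hypothetical payoffs: 3 defender strategies x 3 attacker strategies.
U_DEF = [[2, 1, 0],
         [3, 2, 1],
         [2, 1, 2]]
U_ATT = [[1, 2, 0],
         [0, 1, -1],
         [0, -1, -2]]

print(pure_equilibria(U_DEF, U_ATT))
```

With these made-up numbers, the only cell where both players are simultaneously best-responding is the middle one, `(1, 1)`. A finite game need not have any pure-strategy equilibrium, so `pure_equilibria` can return an empty list for other payoff tables.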

Problem

Research questions and friction points this paper is trying to address.

Deep Learning Model

Integrity Assurance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Game Theory

Deep Learning Watermarking

Model Protection
