SEW: Strengthening Robustness of Black-box DNN Watermarking via Specificity Enhancement

📅 2026-02-03

🤖 AI Summary

This work addresses a vulnerability of black-box deep neural network watermarking: because models generalize, extraction keys are not unique, which lets attackers reverse-engineer approximate keys and remove embedded watermarks. To counter this, we introduce the concept of “watermark specificity” and propose Specificity-Enhanced Watermarking (SEW), a method that amplifies the model’s response to the correct key while suppressing activation by approximate keys. SEW improves robustness against removal attacks without compromising model utility or verification accuracy. Experiments on three mainstream benchmarks show that SEW resists six state-of-the-art removal attacks, strengthening the security of black-box neural network watermarking.

📝 Abstract

To ensure the responsible distribution and use of open-source deep neural networks (DNNs), DNN watermarking has become a crucial technique for tracing and verifying unauthorized model replication or misuse. In practice, black-box watermarks manifest as specific predictive behaviors on specially crafted samples. However, because DNNs generalize, the keys that extract the watermark message are not unique, which gives attackers more opportunities. Advanced attack techniques can reverse-engineer approximate replacements for the original watermark keys, enabling subsequent watermark removal. In this paper, we explore the specificity of black-box DNN watermarking, which refers to the accuracy of a watermark's response to a key. Using this concept, we introduce Specificity-Enhanced Watermarking (SEW), a new method that improves specificity by reducing the association between the watermark and approximate keys. Through extensive evaluation on three popular watermarking benchmarks, we validate that enhancing specificity significantly strengthens robustness against removal attacks. SEW effectively defends against six state-of-the-art removal attacks while maintaining model usability and watermark verification performance.
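The notion of specificity described in the abstract can be illustrated with a small, hypothetical sketch. It is not the paper's metric or the SEW training procedure, neither of which is given on this page. It assumes a toy classifier that emits the watermark label whenever an input lies within some radius of the true key, and it estimates how often randomly perturbed "approximate keys" still trigger that label. The names `approximate_key_hit_rate` and `make_toy_model`, the Gaussian perturbation, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def approximate_key_hit_rate(predict, key, target_label, noise_scale,
                             n_trials=200, seed=0):
    """Fraction of noisy copies of `key` that still trigger `target_label`.

    A lower rate means the watermark responds more narrowly to the exact
    key, which is the informal sense of "specificity" used here.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        # Hypothetical approximate key: the true key plus Gaussian noise.
        approx_key = key + rng.normal(0.0, noise_scale, size=key.shape)
        if predict(approx_key) == target_label:
            hits += 1
    return hits / n_trials


def make_toy_model(key, target_label, radius):
    """Toy stand-in for a watermarked model (an assumption, not a DNN).

    It outputs the watermark label for any input within `radius` of the
    key, and label 0 otherwise.
    """
    def predict(x):
        return target_label if np.linalg.norm(x - key) <= radius else 0
    return predict


key = np.ones(16)
loose = make_toy_model(key, target_label=7, radius=5.0)  # low specificity
tight = make_toy_model(key, target_label=7, radius=0.5)  # high specificity

loose_rate = approximate_key_hit_rate(loose, key, 7, noise_scale=0.5)
tight_rate = approximate_key_hit_rate(tight, key, 7, noise_scale=0.5)
print(f"loose model hit rate: {loose_rate:.2f}")
print(f"tight model hit rate: {tight_rate:.2f}")
```

Both toy models verify correctly on the exact key, but the loose one also fires on most perturbed keys. Under the abstract's argument, that is the kind of model in which an attacker's reverse-engineered approximate key would still be associated with the watermark.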

Problem

Research questions and friction points this paper is trying to address.

black-box DNN watermarking

watermark robustness

key uniqueness

removal attacks

specificity

Innovation

Methods, ideas, or system contributions that make the work stand out.

black-box watermarking

specificity enhancement

robustness