🤖 AI Summary
Existing watermarking methods are typically tailored to specific modalities or tasks and neglect the intrinsic architectural properties of Transformer models, resulting in limited generality and robustness. To address this, we propose the first modality-agnostic watermarking framework for ownership verification of pretrained Transformer models. Our approach exploits the permutation equivariance inherent in Transformers: a data-permutation-based trigger mechanism, combined with a dual-weight decoupling structure in the embedding layer, strictly separates watermark embedding from the primary functional pathway. The method requires only lightweight fine-tuning and supports diverse downstream tasks across modalities, including text and image processing. Extensive evaluation on multiple state-of-the-art Transformer models demonstrates watermark detection accuracy exceeding 99.7%, significantly improved resilience against removal and fine-tuning attacks, and strong cross-task generalizability with high computational efficiency.
📝 Abstract
Watermarking is a critical tool for model ownership verification. However, existing watermarking techniques are often designed for specific data modalities and downstream tasks, without considering the inherent architectural properties of the model. This lack of generality and robustness underscores the need for a more versatile watermarking approach. In this work, we investigate the properties of Transformer models and propose TokenMark, a modality-agnostic, robust watermarking system for pre-trained models, leveraging the permutation equivariance property. TokenMark embeds the watermark by fine-tuning the pre-trained model on a set of specifically permuted data samples, resulting in a watermarked model that contains two distinct sets of weights -- one for normal functionality and the other for watermark extraction, the latter triggered only by permuted inputs. Extensive experiments on state-of-the-art pre-trained models demonstrate that TokenMark significantly improves the robustness, efficiency, and universality of model watermarking, highlighting its potential as a unified watermarking solution.
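The permutation equivariance the abstract relies on can be checked directly: self-attention without positional encodings maps a permuted token sequence to the correspondingly permuted output. Below is a minimal NumPy sketch of this property (single-head attention; all weight names are illustrative, not from the paper's implementation):

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention without positional encodings."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
n, d = 6, 8
X = rng.normal(size=(n, d))                      # n tokens, d-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

perm = rng.permutation(n)                        # a token-order permutation
out = self_attention(X, Wq, Wk, Wv)
out_perm = self_attention(X[perm], Wq, Wk, Wv)

# Permutation equivariance: permuting input tokens permutes output rows the same way.
assert np.allclose(out[perm], out_perm)
```

This is why a permutation of the input can serve as a watermark trigger: the permuted input is processed consistently by the shared attention weights, yet is distinguishable from normal inputs, allowing a separate set of weights to respond to it.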