RegionMarker: A Region-Triggered Semantic Watermarking Framework for Embedding-as-a-Service Copyright Protection

📅 2025-11-17
🤖 AI Summary
Embedding-as-a-Service (EaaS) is vulnerable to model extraction attacks, posing risks of intellectual property leakage and financial loss; existing watermarking methods lack sufficient robustness. This paper proposes a region-triggered semantic watermarking framework: it dynamically partitions trigger regions in a secret, low-dimensional subspace obtained via dimensionality reduction, directly leverages text embeddings as semantically coupled watermark signals, and achieves global coverage and strong imperceptibility through randomized projection and region selection. Its key innovation lies in the first joint design of trigger mechanisms, semantic embedding utilization, and secret-dimensional projection—significantly enhancing resilience against rewriting, dimensional perturbation, and model distillation attacks. Experiments on multiple benchmark datasets demonstrate watermark detection accuracy exceeding 98%, substantially outperforming state-of-the-art methods, while incurring negligible overhead on service performance.

📝 Abstract
Embedding-as-a-Service (EaaS) is an effective and convenient deployment solution for addressing various NLP tasks. Nevertheless, recent research has shown that EaaS is vulnerable to model extraction attacks, which could lead to significant economic losses for model providers. For copyright protection, existing methods inject watermark embeddings into text embeddings and use them to detect copyright infringement. However, current watermarking methods often resist only a subset of attacks and fail to provide comprehensive protection. To this end, we present the region-triggered semantic watermarking framework called RegionMarker, which defines trigger regions within a low-dimensional space and injects watermarks into text embeddings associated with these regions. By utilizing a secret dimensionality reduction matrix to project onto this subspace and randomly selecting trigger regions, RegionMarker makes it difficult for watermark removal attacks to evade detection. Furthermore, by embedding watermarks across the entire trigger region and using the text embedding as the watermark, RegionMarker is resilient to both paraphrasing and dimension-perturbation attacks. Extensive experiments on various datasets show that RegionMarker is effective in resisting different attack methods, thereby protecting the copyright of EaaS.
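The trigger-and-inject idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation; it is a toy reading under stated assumptions. The secret projection is assumed to be a seeded Gaussian random matrix, regions are assumed to be the sign patterns (orthants) of the projected vector, and the watermark signal is assumed to be a coordinate-rolled copy of the text embedding itself, blended in with a hypothetical `strength` weight. The names `make_secret`, `region_id`, and `inject` are invented for this example.

```python
import numpy as np

def make_secret(dim, low_dim, n_trigger, seed):
    """Draw a secret projection matrix and a random set of trigger regions.

    Assumption: regions are the 2**low_dim sign patterns of the projection.
    """
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((dim, low_dim))
    n_regions = 2 ** low_dim
    chosen = rng.choice(n_regions, size=n_trigger, replace=False)
    return proj, set(chosen.tolist())

def region_id(emb, proj):
    """Map an embedding to the index of its region in the secret subspace."""
    bits = (emb @ proj > 0).astype(int)
    return int(bits @ (1 << np.arange(bits.size)))

def inject(emb, proj, trigger, strength=0.2):
    """Return the embedding unchanged, or watermarked if it falls in a trigger region.

    Assumption: the watermark signal is derived from the embedding itself
    (here a one-position coordinate roll), so it stays tied to the text.
    """
    if region_id(emb, proj) not in trigger:
        return emb
    signal = np.roll(emb, 1)  # hypothetical embedding-derived signal
    mixed = (1.0 - strength) * emb + strength * signal
    return mixed / np.linalg.norm(mixed)

if __name__ == "__main__":
    proj, trigger = make_secret(dim=16, low_dim=4, n_trigger=4, seed=0)
    rng = np.random.default_rng(1)
    emb = rng.standard_normal(16)
    emb /= np.linalg.norm(emb)
    out = inject(emb, proj, trigger)
    print("triggered:", region_id(emb, proj) in trigger)
    print("changed:", not np.allclose(out, emb))
```

Because the projection matrix and the chosen regions are both secret in this sketch, an attacker who sees only the returned embeddings cannot directly tell which inputs were modified, which is the intuition the abstract gives for resistance to removal attacks.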
Problem

Research questions and friction points this paper is trying to address.

Protecting EaaS copyright against model extraction attacks
Developing comprehensive watermarking resilient to multiple attack types
Embedding region-triggered watermarks in text embeddings for detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

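RegionMarker defines trigger regions in a low-dimensional space
It projects text embeddings onto that space with a secret dimensionality reduction matrix and injects watermarks into embeddings that fall in the trigger regions
The framework resists paraphrasing and dimension-perturbation attacks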
Shufan Yang
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Zifeng Cheng
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Zhiwei Jiang
Nanjing University
Natural Language Processing
Yafeng Yin
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Cong Wang
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Shiping Ge
Independent Researcher
Multimodal Learning, Data Mining
Yuchen Fu
Nanjing University
Computer Vision, Multimodal Learning
Qing Gu
Nanjing University