PMark: Towards Robust and Distortion-free Semantic-level Watermarking with Channel Constraints

📅 2025-09-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing semantic-level watermarking (SWM) methods lack strong theoretical robustness guarantees against text modification and paraphrasing attacks, and rejection-sampling-based generation often distorts the output distribution relative to unwatermarked text. To address these limitations, the paper proposes PMark, a framework built on proxy functions (PFs), which map sentences to scalar values. PMark dynamically estimates the PF median for the next sentence through sampling and enforces multiple PF constraints, called channels, to strengthen watermark evidence. The paper gives theoretical guarantees that PMark is distortion-free and more robust to paraphrasing-style attacks, and it also presents an empirically optimized variant that drops dynamic median estimation for better sampling efficiency. In the reported experiments, PMark consistently outperforms existing SWM baselines in both text quality and robustness.

📝 Abstract

Semantic-level watermarking (SWM) for large language models (LLMs) enhances watermarking robustness against text modifications and paraphrasing attacks by treating the sentence as the fundamental unit. However, existing methods still lack strong theoretical guarantees of robustness, and rejection-sampling-based generation often introduces significant distribution distortions compared with unwatermarked outputs. In this work, we introduce a new theoretical framework on SWM through the concept of proxy functions (PFs) – functions that map sentences to scalar values. Building on this framework, we propose PMark, a simple yet powerful SWM method that estimates the PF median for the next sentence dynamically through sampling while enforcing multiple PF constraints (which we call channels) to strengthen watermark evidence. Equipped with solid theoretical guarantees, PMark achieves the desired distortion-free property and improves the robustness against paraphrasing-style attacks. We also provide an empirically optimized version that further removes the requirement for dynamic median estimation for better sampling efficiency. Experimental results show that PMark consistently outperforms existing SWM baselines in both text quality and robustness, offering a more effective paradigm for detecting machine-generated text. Our code will be released at [this URL](https://github.com/PMark-repo/PMark).

Problem

Research questions and friction points this paper is trying to address.

Enhancing robustness of semantic watermarking against text modifications

Eliminating distribution distortion in rejection-sampling-based watermark generation

Providing theoretical guarantees for watermarking robustness against paraphrasing attacks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses proxy functions to map sentences to scalar values

Enforces multiple channel constraints for stronger watermark evidence

Dynamically estimates the proxy-function median for the next sentence through sampling, supporting distortion-free outputs
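
To make the three ideas above concrete, here is a minimal Python sketch of a median split over sampled candidate sentences, repeated once per channel. It is an illustration under stated assumptions, not the authors' implementation: `proxy_value`, `key_bit`, and `filter_candidates` are hypothetical names, the hash-based proxy function is a toy stand-in (a real proxy function would presumably depend on sentence meaning, for example through an embedding), and the rule of keeping the side of the median selected by a secret key bit is an assumed reading of "channel constraints".

```python
import hashlib
import statistics


def proxy_value(sentence: str, channel: int) -> float:
    """Toy proxy function (assumption): map a sentence to a scalar in [0, 1].

    A hash is used only so the sketch is self-contained; it carries no
    semantic information.
    """
    digest = hashlib.sha256(f"{channel}:{sentence.lower()}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def key_bit(key: str, channel: int) -> int:
    """Derive one pseudo-random bit per channel from a secret key (assumption)."""
    digest = hashlib.sha256(f"{key}:{channel}".encode()).digest()
    return digest[0] & 1


def filter_candidates(candidates, key, num_channels):
    """Keep candidates lying on the key-selected side of each channel's median.

    The median is estimated from the sampled candidates themselves, so each
    channel keeps roughly half of the remaining candidates.
    """
    kept = list(candidates)
    for channel in range(num_channels):
        if len(kept) < 2:
            break  # nothing left to split
        values = [proxy_value(s, channel) for s in kept]
        median = statistics.median(values)
        want_upper = key_bit(key, channel) == 1
        kept = [s for s, v in zip(kept, values) if (v > median) == want_upper]
    return kept


# Usage: in practice the candidates would be next-sentence samples from an LLM.
candidates = [f"Candidate sentence number {i}." for i in range(8)]
print(filter_candidates(candidates, key="secret", num_channels=2))
```

With eight distinct candidates and two channels, each split keeps about half of the pool, so roughly two candidates survive; a generator would then emit one of the survivors. A detector holding the same key could, under the same assumptions, check how many channel constraints a given sentence satisfies.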

🔎 Similar Papers

SWIFT: Semantic Watermarking for Image Forgery Thwarting