SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models

📅 2025-02-05
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the challenge of tracing text generated by large language models (LLMs), this paper proposes SimMark, a post-hoc watermarking algorithm that requires no access to model internals (e.g., logits) and is therefore compatible with API-only, black-box LLMs. Methodologically, SimMark leverages semantic sentence embeddings (e.g., Sentence-BERT) and cosine similarity, using rejection sampling to embed statistically detectable yet human-imperceptible patterns. It further introduces a soft-counting mechanism based on sentence-level semantic similarity, markedly enhancing robustness against paraphrasing attacks. Experiments across diverse domains show that SimMark consistently outperforms existing sentence-level watermarking methods, achieving higher detection accuracy, stronger resistance to paraphrasing, and better sampling efficiency while preserving text quality. SimMark thus sets a new benchmark for reliable provenance tracking of LLM-generated content.
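The generation-side idea in the summary above can be sketched as rejection sampling against a similarity interval. This is an illustrative sketch, not the authors' implementation: the `embed` function is a toy stand-in for a Sentence-BERT-style embedder, the candidate pool replaces a real LLM proposing next sentences, and the interval `(-0.2, 0.2)` is an assumed placeholder, not SimMark's actual threshold.

```python
import hashlib
import math
import random

def embed(sentence):
    """Toy stand-in for a semantic embedder (SimMark uses Sentence-BERT-style
    models): a deterministic pseudo-random 16-dim unit vector from the text."""
    seed = int.from_bytes(hashlib.sha256(sentence.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(16)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(u, v):
    # Both inputs are unit vectors, so cosine similarity is just the dot product.
    return sum(a * b for a, b in zip(u, v))

def sample_watermarked(prev_sentence, candidates, interval=(-0.2, 0.2), max_tries=50):
    """Rejection sampling: keep drawing candidate next sentences until the
    cosine similarity with the previous sentence lands in the target interval.
    In SimMark, candidates would come from the LLM itself; here they are a
    fixed pool for illustration."""
    prev_emb = embed(prev_sentence)
    for _ in range(max_tries):
        cand = random.choice(candidates)
        if interval[0] <= cosine(prev_emb, embed(cand)) <= interval[1]:
            return cand
    return cand  # fall back after max_tries (the paper reports sampling efficiency)
```

A detector that knows the interval can then check, sentence by sentence, whether consecutive-sentence similarities fall inside it more often than chance would predict.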

πŸ“ Abstract
The rapid proliferation of large language models (LLMs) has created an urgent need for reliable methods to detect whether a text is generated by such models. In this paper, we propose SimMark, a post-hoc watermarking algorithm that makes LLMs' outputs traceable without requiring access to the model's internal logits, enabling compatibility with a wide range of LLMs, including API-only models. By leveraging the similarity of semantic sentence embeddings and rejection sampling to impose detectable statistical patterns imperceptible to humans, and employing a soft counting mechanism, SimMark achieves robustness against paraphrasing attacks. Experimental results demonstrate that SimMark sets a new benchmark for robust watermarking of LLM-generated content, surpassing prior sentence-level watermarking techniques in robustness, sampling efficiency, and applicability across diverse domains, all while preserving the text quality.
Problem

Research questions and friction points this paper is trying to address.

Detecting LLM-generated text
Designing a robust watermarking algorithm
Ensuring compatibility with diverse LLMs, including API-only models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Similarity of semantic sentence embeddings
Rejection sampling to impose detectable patterns
Soft-counting mechanism for paraphrase robustness
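The soft-counting idea listed above can be sketched as follows. This is an assumed, illustrative shape (a linear ramp over a slack margin), not SimMark's exact counting function, and the null in-interval probability `p_null=0.5` and the interval bounds are placeholders: similarities inside the interval count fully, while similarities pushed slightly outside by a paraphrase still earn partial credit instead of being a hard miss.

```python
import math

def soft_count(sim, interval=(-0.2, 0.2), slack=0.1):
    """1.0 inside the interval, decaying linearly to 0.0 over a slack margin
    outside it -- an illustrative soft-counting function."""
    lo, hi = interval
    if lo <= sim <= hi:
        return 1.0
    dist = (lo - sim) if sim < lo else (sim - hi)
    return max(0.0, 1.0 - dist / slack)

def detect(similarities, interval=(-0.2, 0.2), slack=0.1, p_null=0.5):
    """z-style statistic: compare the soft count over consecutive-sentence
    similarities against unwatermarked text, modeled here as landing
    in-interval with probability p_null (an assumption for this sketch)."""
    n = len(similarities)
    c = sum(soft_count(s, interval, slack) for s in similarities)
    return (c - n * p_null) / math.sqrt(n * p_null * (1 - p_null))
```

A high z-value flags the text as watermarked; because a mild paraphrase only shaves a fraction off each sentence's contribution, the statistic degrades gracefully rather than collapsing.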
Amirhossein Dabiriaghdam
Department of ECE, University of British Columbia, Vancouver, BC, Canada
Lele Wang
University of British Columbia
Information Theory · Coding Theory · Graph Theory