The Feasibility of Topic-Based Watermarking on Academic Peer Reviews

📅 2025-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

The increasing use of large language models (LLMs) in academic peer review poses a critical challenge: reliably detecting LLM-generated text to safeguard review integrity. Existing detection methods lack robustness and practical deployability in real-world review settings. Method: This paper introduces Topic-Based Watermarking (TBW), the first lightweight, semantics-aware, topic-driven watermarking method specifically designed for authentic peer review scenarios. TBW integrates LLM-derived text embeddings with a robust watermark detection framework, enabling verifiable provenance tracing of LLM-generated content without compromising review quality. Contribution/Results: Evaluated on real conference review data, TBW maintains statistically insignificant degradation in review quality while achieving >92% detection accuracy against diverse paraphrasing attacks. Designed with security, practicality, and regulatory compliance in mind, TBW constitutes the first production-ready watermarking solution tailored to end-to-end academic peer review workflows.

Technology Category

Application Category

📝 Abstract

Large language models (LLMs) are increasingly integrated into academic workflows, with many conferences and journals permitting their use for tasks such as language refinement and literature summarization. However, their use in peer review remains prohibited due to concerns around confidentiality breaches, hallucinated content, and inconsistent evaluations. As LLM-generated text becomes more indistinguishable from human writing, there is a growing need for reliable attribution mechanisms to preserve the integrity of the review process. In this work, we evaluate topic-based watermarking (TBW), a lightweight, semantic-aware technique designed to embed detectable signals into LLM-generated text. We conduct a comprehensive assessment across multiple LLM configurations, including base, few-shot, and fine-tuned variants, using authentic peer review data from academic conferences. Our results show that TBW maintains review quality relative to non-watermarked outputs, while demonstrating strong robustness to paraphrasing-based evasion. These findings highlight the viability of TBW as a minimally intrusive and practical solution for enforcing LLM usage in peer review.

Problem

Research questions and friction points this paper is trying to address.

Detecting LLM-generated text in peer reviews for integrity

Preventing confidentiality breaches and inconsistent evaluations in reviews

Ensuring watermark robustness against paraphrasing evasion techniques

Innovation

Methods, ideas, or system contributions that make the work stand out.

Topic-based watermarking for LLM-generated text

Semantic-aware technique for detectable signals

Robust to paraphrasing-based evasion

🔎 Similar Papers

No similar papers found.

Authors to Follow