🤖 AI Summary
This work presents the first systematic cryptanalysis of pseudorandom error-correcting codes (PRCs), a novel cryptographic primitive for AI watermarking whose security had not been thoroughly analyzed. We propose three practical attacks: (1) statistical distinguishing combined with algebraic decoding to break undetectability; (2) robustness-breaking attacks that model the output features of large models; and (3) key-recovery analysis driven by weak keys. Evaluated on real-world models, including DeepSeek and Stable Diffusion, the attacks undermine the claimed PRC security guarantees across all tested parameter configurations and detect watermarks with overwhelming probability. One detection attack costs only about 2²² operations. We also propose a revised key-generation scheme that resists weak-key exploitation. Even with the suggested parameters, existing PRC-based watermarking schemes fail to reach 128-bit security, which exposes critical cryptographic weaknesses in current AI watermarking primitives.
📝 Abstract
Pseudorandom error-correcting codes (PRCs) are a novel cryptographic primitive proposed at CRYPTO 2024. Because they combine pseudorandomness with error correction, PRCs have been recognized as a promising foundational component for watermarking AI-generated content. However, the security of PRCs has not been thoroughly analyzed, particularly for concrete parameters and against cryptographic attacks. To fill this gap, we present the first cryptanalysis of PRCs. We first propose three attacks that challenge the undetectability and robustness assumptions of PRCs: two aim to distinguish PRC codewords from plain vectors, and one aims to compromise the PRC decoding process. Our attacks undermine the claimed security guarantees across all parameter configurations. Notably, one attack detects the presence of a watermark with overwhelming probability at a cost of $2^{22}$ operations. We also validate our approach by attacking real-world large generative models such as DeepSeek and Stable Diffusion. To mitigate these attacks, we propose three defenses: parameter suggestions, implementation suggestions, and a revised key generation algorithm. The revised key generation algorithm prevents the occurrence of weak keys. However, we highlight that current PRC-based watermarking schemes still cannot achieve 128-bit security under our parameter suggestions because of inherent constraints of large generative models, such as the maximum output length of large language models.
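
For a sense of scale, the $2^{22}$ detection cost can be set against the 128-bit target mentioned at the end of the abstract. The following is a minimal arithmetic sketch only: the constant and function names are hypothetical, the two figures are taken from the abstract, and nothing here reflects how the attack itself works or which parameter set each figure refers to.

```python
# Illustrative arithmetic only; both figures are quoted from the abstract.
ATTACK_COST_BITS = 22       # reported detection attack cost: 2^22 operations
TARGET_SECURITY_BITS = 128  # 128-bit security target named in the abstract


def security_gap_bits(attack_bits: int, target_bits: int) -> int:
    """Return how many bits the attack cost falls short of the target."""
    return target_bits - attack_bits


attack_ops = 2 ** ATTACK_COST_BITS
gap = security_gap_bits(ATTACK_COST_BITS, TARGET_SECURITY_BITS)

print(f"attack cost: {attack_ops:,} operations")
print(f"shortfall versus target: {gap} bits (a factor of 2^{gap})")
```

Under these assumptions the reported attack cost is roughly four million operations, far below what a 128-bit security level is meant to require.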