AI Summary
To address the challenge of tracing and detecting code generated by large language models (LLMs), this paper proposes a reinforcement learning-based adaptive code watermarking framework. The method embeds statistically detectable yet semantically imperceptible implicit watermarks into syntactically valid code, while strictly preserving functional correctness via a policy model that guides token-level selection. A novel composite reward function integrates program execution feedback with watermark signal fidelity, and Gumbel Top-k reparameterization enables end-to-end gradient optimization over discrete token decisions. Experiments across multiple benchmark datasets demonstrate significant improvements over state-of-the-art approaches: watermark detection accuracy increases by 12.6%, functional preservation exceeds 99.3%, and downstream task performance degradation remains negligible.
Abstract
Detecting LLM-generated code requires watermarking systems that can operate within the highly structured and syntactically constrained environment of source code. To address this, we introduce CodeTracer, an adaptive code watermarking framework underpinned by a novel reinforcement learning training paradigm. At its core, CodeTracer features a policy-driven approach that uses a parameterized model to bias token choices during next-token prediction. This strategy ensures that embedded watermarks maintain code functionality while exhibiting subtle yet statistically detectable deviations from typical token distributions. To facilitate policy learning, we devise a reward system that integrates execution feedback with watermark embedding signals, balancing process-level and outcome-level rewards. Additionally, we employ Gumbel Top-k reparameterization to enable gradient-based optimization of discrete watermarking decisions. Extensive comparative evaluations demonstrate CodeTracer's significant superiority over state-of-the-art baselines in both watermark detectability and the preservation of generated code's functionality.
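The abstract does not give implementation details for the Gumbel Top-k step. As a rough illustration of the sampling trick that the name refers to, and not of CodeTracer's actual code, the sketch below perturbs each logit with independent Gumbel(0, 1) noise and keeps the k largest perturbed values, which draws k distinct token indices without replacement. The function name `gumbel_top_k`, the example logits, and the seed are all hypothetical. The sketch covers sampling only; making the selection differentiable for policy training would need an additional relaxation that the abstract does not describe and that is not shown here.

```python
import math
import random


def gumbel_top_k(logits, k, rng):
    """Return k distinct indices sampled via the Gumbel top-k trick.

    Hypothetical helper for illustration; not taken from the paper.
    """
    perturbed = []
    for index, logit in enumerate(logits):
        # rng.random() lies in [0, 1); clamp away from 0 to avoid log(0).
        u = max(rng.random(), 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise, index))
    # Highest perturbed scores first; keep the top k indices.
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


# Assumed toy logits over a four-token vocabulary.
example_logits = [2.0, 1.0, 0.5, -1.0]
example_rng = random.Random(0)
example_picks = gumbel_top_k(example_logits, 2, example_rng)
```

With k = 1 this reduces to the Gumbel-max trick, so the chosen index follows the softmax distribution of the logits; larger k extends that to sampling without replacement.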