Optimizing Token Choice for Code Watermarking: A RL Approach

2025-08-16
AI Summary
To address the challenge of tracing and detecting code generated by large language models (LLMs), this paper proposes a reinforcement learning-based adaptive code watermarking framework. The method embeds statistically detectable yet semantically imperceptible implicit watermarks into syntactically valid code, while strictly preserving functional correctness via a policy model that guides token-level selection. A novel composite reward function integrates program execution feedback with watermark signal fidelity, and Gumbel Top-k reparameterization enables end-to-end gradient optimization over discrete token decisions. Experiments across multiple benchmark datasets demonstrate significant improvements over state-of-the-art approaches: watermark detection accuracy increases by 12.6%, functional preservation exceeds 99.3%, and downstream task performance degradation remains negligible.

Abstract
The need for detecting LLM-generated code necessitates watermarking systems capable of operating within its highly structured and syntactically constrained environment. To address this, we introduce CodeTracer, an innovative adaptive code watermarking framework underpinned by a novel reinforcement learning training paradigm. At its core, CodeTracer features a policy-driven approach that utilizes a parameterized model to intelligently bias token choices during next-token prediction. This strategy ensures that embedded watermarks maintain code functionality while exhibiting subtle yet statistically detectable deviations from typical token distributions. To facilitate policy learning, we devise a comprehensive reward system that seamlessly integrates execution feedback with watermark embedding signals, balancing process-level and outcome-level rewards. Additionally, we employ Gumbel Top-k reparameterization to enable gradient-based optimization of discrete watermarking decisions. Extensive comparative evaluations demonstrate CodeTracer's significant superiority over state-of-the-art baselines in both watermark detectability and the preservation of generated code's functionality.
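The abstract describes a reward that combines execution feedback with watermark embedding signals but does not give its form. The following is a minimal sketch of one way such a blend could look, not the paper's actual reward. The function name `composite_reward`, the weight `alpha`, the use of a test pass rate as the outcome-level term, and the use of a mean per-token watermark score as the process-level term are all assumptions made for illustration.

```python
def composite_reward(passed_tests, total_tests, watermark_scores, alpha=0.5):
    """Blend an outcome-level and a process-level signal into one scalar.

    Hypothetical sketch: the weighting scheme and both terms are assumptions,
    not the reward defined in the paper.
    """
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Outcome-level term (assumed): fraction of unit tests the generated code passes.
    outcome = passed_tests / total_tests

    # Process-level term (assumed): mean per-token watermark score in [0, 1],
    # e.g. 1.0 when a token follows the watermark policy and 0.0 otherwise.
    if watermark_scores:
        process = sum(watermark_scores) / len(watermark_scores)
    else:
        process = 0.0

    # Convex combination: alpha trades functionality against detectability.
    return alpha * outcome + (1.0 - alpha) * process
```

With `alpha` close to 1 the sketch rewards passing tests almost exclusively, and with `alpha` close to 0 it rewards the watermark signal; the balance the paper actually uses is not stated in this summary.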
Problem

Research questions and friction points this paper is trying to address.

Detecting LLM-generated code via watermarking in constrained environments
Embedding watermarks without compromising code functionality
Balancing watermark detectability with code quality preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning training paradigm for watermarking
Policy-driven model biases token choices intelligently
Gumbel Top-k reparameterization enables gradient-based optimization
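The Gumbel Top-k idea listed above can be illustrated with a small sketch of its sampling step: adding independent Gumbel(0, 1) noise to each logit and keeping the k largest perturbed values draws k distinct tokens without replacement. This shows only the generic technique; how CodeTracer relaxes it to pass gradients through the discrete choice is not described here, and the function name `gumbel_top_k` and its parameters are assumptions made for illustration.

```python
import math
import random


def gumbel_top_k(logits, k, rng=None):
    """Return k distinct indices chosen by the Gumbel Top-k trick.

    Hypothetical helper: covers the sampling step only, not a differentiable
    relaxation or the paper's training procedure.
    """
    if not 0 < k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    rng = rng or random.Random()

    perturbed = []
    for index, logit in enumerate(logits):
        # Clamp away from 0 so that log(u) is always defined.
        u = max(rng.random(), 1e-12)
        # Inverse-CDF sample from the standard Gumbel distribution.
        noise = -math.log(-math.log(u))
        perturbed.append((logit + noise, index))

    # The k largest perturbed logits form a sample without replacement.
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]
```

For example, `gumbel_top_k([2.0, 0.5, 0.1, 1.2], k=2, rng=random.Random(0))` returns two distinct token indices, with higher-logit tokens more likely to be selected.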