Beyond pass@k: Redundancy-Aware RLVR for Multi-Sample Code Generation

📅 2026-05-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing reinforcement-learning approaches to code generation improve executable correctness but overlook implementation-level redundancy across multiple samples, which limits diversity and hurts performance under limited sampling budgets. This work systematically quantifies that redundancy using JPlag and introduces an anti-redundancy reward based on JPlag similarity, integrated directly into verifier-based reinforcement learning (RLVR) to explicitly optimize generation diversity. Experiments across three models and three benchmarks show that the method reduces redundancy and reliably improves executable performance under constrained sampling, often matching or outperforming specialized Pass@k-aware methods. These results challenge the conventional RLVR focus on correctness alone and highlight the role of diversity in code generation.
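As a rough illustration of what an anti-redundancy reward could look like, the sketch below scales a binary correctness reward down when a sample closely resembles another sample in the same group. It is not the paper's formulation: the function names, the `penalty_weight` parameter, and the multiplicative form are all hypothetical, and Python's `difflib.SequenceMatcher` stands in for JPlag, which is a separate tool with its own token-based matching.

```python
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]. The paper uses JPlag, not difflib."""
    return SequenceMatcher(None, a, b).ratio()


def redundancy_aware_rewards(
    programs: List[str],
    passed: List[bool],
    penalty_weight: float = 0.5,  # hypothetical hyperparameter
) -> List[float]:
    """Correctness reward, discounted by the sample's highest similarity
    to any other sample in the same group (an assumed design)."""
    rewards = []
    for i, prog in enumerate(programs):
        correctness = 1.0 if passed[i] else 0.0
        # Highest similarity to any *other* sample; 0.0 if the group has one sample.
        max_sim = max(
            (similarity(prog, other) for j, other in enumerate(programs) if j != i),
            default=0.0,
        )
        rewards.append(correctness * (1.0 - penalty_weight * max_sim))
    return rewards


samples = [
    "def f(x):\n    return x + 1\n",
    "def f(x):\n    return x + 1\n",  # exact duplicate of the first sample
    "def f(x):\n    y = x\n    y += 1\n    return y\n",
]
print(redundancy_aware_rewards(samples, passed=[True, True, True]))
```

Under these assumptions, the two duplicate samples receive a lower reward than the structurally different one, even though all three pass the tests.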
📝 Abstract
LLMs for code generation are commonly evaluated in repeated-sampling settings using Pass@k, where multiple candidate programs are executed against unit tests under a finite sampling budget. While recent verifier-based reinforcement learning (RLVR) methods improve executable correctness, how these objectives affect redundancy among sampled programs remains poorly understood. In this work, we study implementation-level redundancy in code generation using JPlag, a plagiarism-detection system for code. Across models and benchmarks, we show that correctness-only RLVR often concentrates generations around repeated implementations, whereas Pass@k-aware objectives maintain lower redundancy and improve larger-budget performance. Motivated by these observations, we augment RLVR with direct anti-redundancy rewards based on JPlag similarity. Across 3 models and 3 benchmarks, discouraging near-duplicate generations reliably improves finite-budget executable performance, often matching or outperforming specialized Pass@k-aware objectives.
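For readers unfamiliar with the metric, Pass@k is usually computed with the standard unbiased estimator: given n sampled programs of which c pass the unit tests, it is 1 - C(n-c, k) / C(n, k), the probability that a random size-k subset contains at least one passing program. The abstract does not state which estimator the authors use, so the sketch below is an assumption about the evaluation, and the function name is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate from n samples with c passing ones."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 10 samples, 3 of which pass the tests.
for k in (1, 5, 10):
    print(k, pass_at_k(n=10, c=3, k=k))
```

The estimator counts only how many samples pass, not how they differ from one another, which is why the paper measures redundancy separately with JPlag.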
Problem

Research questions and friction points this paper is trying to address.

code generation
redundancy
LLMs
Pass@k
JPlag
Innovation

Methods, ideas, or system contributions that make the work stand out.

Redundancy-Aware RLVR
JPlag similarity
multi-sample code generation
anti-redundancy reward
Pass@k