Code Copycat Conundrum: Demystifying Repetition in LLM-based Code Generation

📅 2025-04-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Repetition in code generated by large language models (LLMs) occurs at multiple granularities (character, statement, and block level) and remains a pervasive, under-studied issue. Method: This work presents the first systematic empirical study of multi-granularity repetition across 19 state-of-the-art code LLMs on three widely used benchmarks, distills a taxonomy of 20 distinct repetition patterns, and introduces DeRep, a lightweight, interpretable, rule-based deduplication technique that integrates syntactic structure analysis, semantic similarity assessment, sliding-window pattern detection, and multi-level redundancy filtering. Contribution/Results: Evaluated on open-source benchmarks and in an industrial setting, DeRep reduces repetition by an average of 91.3%, 93.5%, and 79.9% on the rep-3, rep-line, and sim-line metrics and raises Pass@1 by 208.3% over greedy search. Combined with existing repetition-mitigation methods, it further improves Pass@1 by 53.7% to 215.7%, enhancing both the conciseness and the functional correctness of generated code.
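
The summary is stated in terms of the rep-3, rep-line, and sim-line repetition metrics. The paper's exact definitions are not reproduced on this page, so the Python sketch below is only one plausible reading of them, assuming rep-n measures duplicated word-level n-grams, rep-line measures exact duplicate lines, and sim-line measures near-duplicate lines above a similarity threshold; the function names and the 0.9 threshold are illustrative, not the paper's.

```python
from collections import Counter
from difflib import SequenceMatcher

def rep_n(text: str, n: int = 3) -> float:
    """Share of word-level n-grams that repeat an earlier n-gram (rep-3 uses n=3)."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    repeated = sum(count - 1 for count in Counter(ngrams).values())
    return repeated / len(ngrams)

def rep_line(code: str) -> float:
    """Share of non-empty lines that exactly duplicate an earlier line."""
    lines = [line.strip() for line in code.splitlines() if line.strip()]
    if not lines:
        return 0.0
    seen, duplicates = set(), 0
    for line in lines:
        if line in seen:
            duplicates += 1
        seen.add(line)
    return duplicates / len(lines)

def sim_line(code: str, threshold: float = 0.9) -> float:
    """Share of non-empty lines that are near-duplicates (ratio >= threshold) of an earlier line."""
    lines = [line.strip() for line in code.splitlines() if line.strip()]
    if not lines:
        return 0.0
    near_dups = sum(
        any(SequenceMatcher(None, line, prev).ratio() >= threshold for prev in lines[:i])
        for i, line in enumerate(lines)
    )
    return near_dups / len(lines)
```
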

📝 Abstract
Despite recent advances in Large Language Models (LLMs) for code generation, the quality of LLM-generated code still faces significant challenges. One notable issue is code repetition, the model's tendency to generate structurally redundant code, which results in inefficiency and reduced readability. To address this, we conduct the first empirical study investigating the prevalence and nature of repetition across 19 state-of-the-art code LLMs using three widely used benchmarks. Our study includes both quantitative and qualitative analyses, revealing that repetition is pervasive and manifests at various granularities and extents, including the character, statement, and block levels. We further summarize a taxonomy of 20 repetition patterns. Building on our findings, we propose DeRep, a rule-based technique designed to detect and mitigate repetition in generated code. We evaluate DeRep both on open-source benchmarks and in an industrial setting. Our results demonstrate that DeRep significantly outperforms baselines in reducing repetition (with average improvements of 91.3%, 93.5%, and 79.9% on the rep-3, rep-line, and sim-line metrics) and enhancing code quality (with a Pass@1 increase of 208.3% over greedy search). Furthermore, integrating DeRep improves the performance of existing repetition mitigation methods, with Pass@1 improvements ranging from 53.7% to 215.7%.
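
Pass@1 is the functional-correctness metric quoted above. Under greedy search it is simply the fraction of problems whose single generated solution passes all tests; for sampled generations the standard unbiased pass@k estimator applies. A minimal sketch (the helper name is ours, not the paper's):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Standard unbiased pass@k estimator.
    n: samples generated per problem, c: samples that pass all tests, k: evaluation budget."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With greedy search (n = k = 1), per-problem pass@1 is 1.0 if the single
# greedy completion passes its tests and 0.0 otherwise; the benchmark score
# is the mean over problems.
```
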
Problem

Research questions and friction points this paper is trying to address.

Investigates code repetition in LLM-generated code
Analyzes repetition prevalence across 19 code LLMs
Proposes DeRep to detect and mitigate code repetition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Empirical study on repetition in 19 code LLMs
Taxonomy of 20 repetition patterns identified
Rule-based DeRep technique reduces repetition significantly
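
Per the summary, DeRep combines syntactic structure analysis, semantic similarity assessment, sliding-window pattern detection, and multi-level redundancy filtering; its actual rules are not given on this page. The sketch below is a deliberately minimal stand-in for the sliding-window idea only, not the authors' implementation: it trims trailing output once a short block of lines repeats verbatim.

```python
def trim_trailing_repetition(code: str, max_window: int = 4, min_repeats: int = 3) -> str:
    """Toy sliding-window rule: if the generation ends with a block of 1..max_window
    lines repeated verbatim at least min_repeats times, keep one copy and drop the rest."""
    lines = code.splitlines()
    for window in range(1, max_window + 1):
        pattern = lines[-window:]
        if len(pattern) < window:
            continue  # not enough lines for this window size
        repeats, i = 0, len(lines)
        while i >= window and lines[i - window:i] == pattern:
            repeats += 1
            i -= window
        if repeats >= min_repeats:
            return "\n".join(lines[:i] + pattern)
    return code

# Example: a generation that degenerates into the same print statement repeated
# many times is cut back to a single occurrence, leaving the preceding code untouched.
```
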
Authors
Ming Liu, Sun Yat-sen University, Zhuhai, China
Juntao Li, Soochow University (Language Models, Text Generation)
Ying Wang, Fudan University, Shanghai, China
Xueying Du, Fudan University (AI4SE)
Qiuyuan Chen, Tencent Technology (Software Engineering)
Bingxu An, Tencent Technology
Zhao Wei, Tencent Technology
Yong Xu, Tencent Technology
Fangming Zou, Tencent Technology
Xin Peng, East China University of Science and Technology (Artificial Intelligence, Machine Learning, Complex Process Modeling)
Yiling Lou, Fudan University, China (Software Engineering, Testing, Debugging)