R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models

📅 2024-06-03
🏛️ arXiv.org
📈 Citations: 6
Influential: 0
🤖 AI Summary
Existing repository-level code completion methods make insufficient use of project-wide context, particularly cross-file and cross-class dependencies, and existing benchmarks cover only narrow completion scenarios, often limited to single-file snippet completion. To address these limitations, the authors propose R2C2-Coder, which comprises two components: (1) R2C2-Enhance, a retrieval-augmented prompt construction method that builds a candidate retrieval pool and retrieves relevant cross-file context for each completion cursor position; and (2) R2C2-Bench, a benchmark with training, validation, and test splits that uses a context perturbation strategy to simulate diverse, realistic completion scenarios. Evaluated on multiple repository-level code completion benchmarks, R2C2-Coder consistently outperforms existing approaches, indicating stronger long-range dependency modeling and generalization.

📝 Abstract
Code completion models have made significant progress in recent years. Recently, repository-level code completion has drawn more attention in modern software development, and several baseline methods and benchmarks have been proposed. However, existing repository-level code completion methods often fall short of fully using the extensive context of a project repository, such as the intricacies of relevant files and class hierarchies. Besides, existing benchmarks usually focus on limited code completion scenarios, which cannot well reflect the repository-level code completion abilities of existing methods. To address these limitations, we propose R2C2-Coder to enhance and benchmark the real-world repository-level code completion abilities of code Large Language Models, where R2C2-Coder includes a code prompt construction method, R2C2-Enhance, and a well-designed benchmark, R2C2-Bench. Specifically, first, in R2C2-Enhance, we construct the candidate retrieval pool and then assemble the completion prompt by retrieving from the pool for each completion cursor position. Second, based on R2C2-Enhance, we construct a more challenging and diverse benchmark, R2C2-Bench, with training, validation, and test splits, where a context perturbation strategy is proposed to simulate real-world repository-level code completion. Extensive results on multiple benchmarks demonstrate the effectiveness of our R2C2-Coder.
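The abstract's two-step pipeline in R2C2-Enhance (build a candidate retrieval pool, then retrieve context for each cursor position to assemble the prompt) can be illustrated with a minimal sketch. The fixed-size line chunking, Jaccard token similarity, and all function names below are illustrative assumptions, not the paper's actual implementation:

```python
def split_into_chunks(files, window=4):
    """Slice each repository file into fixed-size line windows
    to form the candidate retrieval pool (assumed chunking scheme)."""
    pool = []
    for path, text in files.items():
        lines = text.splitlines()
        for i in range(0, len(lines), window):
            chunk = "\n".join(lines[i:i + window])
            if chunk.strip():
                pool.append((path, chunk))
    return pool

def jaccard(a, b):
    """Token-overlap similarity; a stand-in for the paper's retriever."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def build_prompt(files, in_file_prefix, top_k=2):
    """Retrieve the top-k pool chunks most similar to the code before the
    cursor and prepend them as commented context to the in-file prefix."""
    pool = split_into_chunks(files)
    ranked = sorted(pool, key=lambda pc: jaccard(pc[1], in_file_prefix),
                    reverse=True)
    context = "\n\n".join(f"# from {path}\n{chunk}"
                          for path, chunk in ranked[:top_k])
    return context + "\n\n" + in_file_prefix
```

With this sketch, a cursor prefix such as `x = add(a, b)` would rank a chunk defining `add` above unrelated chunks, so the assembled prompt carries the cross-file definition the model needs to complete the call.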
Problem

Research questions and friction points this paper is trying to address.

Enhancing repository-level code completion using extensive context
Addressing limited scenarios in existing code completion benchmarks
Improving real-world code completion abilities of large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-based prompt construction for context
Benchmark with context perturbation strategy
Enhancing repository-level code completion abilities
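The context perturbation strategy named above is not detailed on this page; one plausible reading is that retrieved context is deliberately degraded (dropped or reordered) so the benchmark reflects the imperfect retrieval seen in real repositories. A hedged sketch of that idea, with assumed parameters:

```python
import random

def perturb_context(chunks, drop_prob=0.3, shuffle=True, seed=None):
    """Simulate imperfect real-world retrieval by randomly dropping and
    reordering retrieved context chunks (illustrative, not the paper's
    actual perturbation strategy)."""
    rng = random.Random(seed)
    kept = [c for c in chunks if rng.random() >= drop_prob]
    if not kept and chunks:
        kept = [rng.choice(chunks)]  # keep at least one chunk
    if shuffle:
        rng.shuffle(kept)
    return kept
```

Training and evaluating on such perturbed prompts would reward models that stay robust when the retrieved context is noisy or incomplete.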
👥 Authors
Ken Deng (Kwaipilot Team, Kuaishou Technology)
Jiaheng Liu (Alibaba Group)
He Zhu (Alibaba Group)
Congnan Liu (Alibaba Group)
Jingxin Li (Alibaba Group)
Jiakai Wang (Zhongguancun Laboratory)
Peng Zhao (Alibaba Group)
Chenchen Zhang (Alibaba Group)
Yanan Wu (Alibaba Group)
Xueqiao Yin (Alibaba Group)
Yuanxing Zhang (Kuaishou Technology)
Wenbo Su (Alibaba Group)
Bangyu Xiang (Alibaba Group)
Tiezheng Ge (Alimama, Alibaba Group)
Bo Zheng (Alibaba Group)