🤖 AI Summary
Existing repository-level code completion methods underuse the context available in a project repository, in particular relevant files and class hierarchies, and existing benchmarks cover only a limited range of completion scenarios. To address these limitations, we propose R2C2-Coder, which comprises two components: (1) R2C2-Enhance, a code prompt construction method that retrieves cross-file context for each completion cursor position; and (2) R2C2-Bench, a more challenging and diverse benchmark that uses context perturbation to simulate real-world completion scenarios. Technically, R2C2-Coder combines candidate retrieval pool construction, retrieval-based prompt assembly, context perturbation, and a benchmark with training, validation, and test splits. Extensive results on multiple repository-level code completion benchmarks demonstrate the effectiveness of R2C2-Coder.
📝 Abstract
Code completion models have made significant progress in recent years. Repository-level code completion in particular has drawn growing attention in modern software development, and several baseline methods and benchmarks have been proposed. However, existing repository-level code completion methods often fail to make full use of the extensive context of a project repository, such as the intricacies of relevant files and class hierarchies. In addition, existing benchmarks usually focus on a limited set of code completion scenarios and therefore do not reflect the repository-level code completion abilities of existing methods well. To address these limitations, we propose R2C2-Coder to enhance and benchmark the real-world repository-level code completion abilities of code Large Language Models. R2C2-Coder includes a code prompt construction method, R2C2-Enhance, and a well-designed benchmark, R2C2-Bench. Specifically, in R2C2-Enhance, we first construct a candidate retrieval pool and then, for each completion cursor position, assemble the completion prompt by retrieving from that pool. Then, based on R2C2-Enhance, we construct a more challenging and diverse R2C2-Bench with training, validation, and test splits, in which a context perturbation strategy is proposed to better simulate real-world repository-level code completion. Extensive results on multiple benchmarks demonstrate the effectiveness of R2C2-Coder.
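The two steps described for R2C2-Enhance (build a candidate retrieval pool, then retrieve from it for a given cursor position) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify how the pool is chunked, which similarity measure is used, or how the prompt is laid out. The function names (`build_pool`, `assemble_prompt`), the fixed-size line chunks, and the Jaccard token overlap are hypothetical stand-ins, not the authors' implementation.

```python
import re


def tokenize(text):
    """Extract identifier-like tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)


def build_pool(repo_files, chunk_lines=4):
    """Step 1 (assumed form): split every file into fixed-size line chunks."""
    pool = []
    for path, source in repo_files.items():
        lines = source.splitlines()
        for start in range(0, len(lines), chunk_lines):
            chunk = "\n".join(lines[start:start + chunk_lines])
            if chunk.strip():
                pool.append((path, chunk))
    return pool


def jaccard(tokens_a, tokens_b):
    """Token-set overlap; a stand-in for whatever retriever the paper uses."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def assemble_prompt(pool, current_path, prefix, top_k=2, window_lines=5):
    """Step 2 (assumed form): retrieve chunks similar to the code just before
    the cursor and prepend them to the in-file prefix."""
    query = tokenize("\n".join(prefix.splitlines()[-window_lines:]))
    scored = [
        (jaccard(query, tokenize(chunk)), path, chunk)
        for path, chunk in pool
        if path != current_path  # only cross-file context is retrieved
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    retrieved = [item for item in scored[:top_k] if item[0] > 0]
    context = "".join(
        f"# Retrieved from {path}\n{chunk}\n\n" for _, path, chunk in retrieved
    )
    return context + f"# Current file: {current_path}\n" + prefix


if __name__ == "__main__":
    # Toy repository; file names and contents are invented for the example.
    repo_files = {
        "utils/geometry.py": "def circle_area(radius):\n    return 3.14159 * radius * radius\n",
        "utils/text.py": "def shout(message):\n    return message.upper() + '!'\n",
    }
    prefix = "from utils.geometry import circle_area\n\nradius = 2.0\narea = circle_area("
    print(assemble_prompt(build_pool(repo_files), "main.py", prefix))
```

In this toy run, the chunk from `utils/geometry.py` shares the tokens `circle_area` and `radius` with the code before the cursor, so it is placed ahead of the in-file prefix, while the unrelated chunk from `utils/text.py` scores zero and is left out. The prompt that a code LLM would then complete ends exactly at the cursor position.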