R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models

📅 2024-06-03
🏛️ arXiv.org
📈 Citations: 6
Influential: 0
🤖 AI Summary
Existing repository-level code completion methods make insufficient use of project-wide context, particularly cross-file and cross-class dependencies, and existing benchmarks cover only narrow completion scenarios, often limited to single-file snippet completion. To address these limitations, the authors propose R2C2-Coder, which comprises two components: (1) R2C2-Enhance, a retrieval-augmented prompt construction method that builds a candidate retrieval pool and retrieves relevant cross-file context for each completion cursor position; and (2) R2C2-Bench, a benchmark with training, validation, and test splits that uses a context perturbation strategy to simulate diverse, realistic completion scenarios. Evaluated on multiple repository-level code completion benchmarks, R2C2-Coder consistently outperforms existing approaches, indicating stronger long-range dependency modeling and generalization.

📝 Abstract
Code completion models have made significant progress in recent years. Recently, repository-level code completion has drawn more attention in modern software development, and several baseline methods and benchmarks have been proposed. However, existing repository-level code completion methods often fall short of fully using the extensive context of a project repository, such as the intricacies of relevant files and class hierarchies. Besides, existing benchmarks usually focus on limited code completion scenarios, which cannot well reflect the repository-level code completion abilities of existing methods. To address these limitations, we propose R2C2-Coder to enhance and benchmark the real-world repository-level code completion abilities of code Large Language Models, where R2C2-Coder includes a code prompt construction method, R2C2-Enhance, and a well-designed benchmark, R2C2-Bench. Specifically, first, in R2C2-Enhance, we construct the candidate retrieval pool and then assemble the completion prompt by retrieving from the pool for each completion cursor position. Second, based on R2C2-Enhance, we construct a more challenging and diverse benchmark, R2C2-Bench, with training, validation, and test splits, where a context perturbation strategy is proposed to simulate real-world repository-level code completion. Extensive results on multiple benchmarks demonstrate the effectiveness of our R2C2-Coder.
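The abstract's two-step pipeline in R2C2-Enhance (build a candidate retrieval pool, then retrieve context for each cursor position to assemble the prompt) can be illustrated with a minimal sketch. The fixed-size line chunking, Jaccard token similarity, and all function names below are illustrative assumptions, not the paper's actual implementation:

```python
def split_into_chunks(files, window=4):
    """Slice each repository file into fixed-size line windows
    to form the candidate retrieval pool (assumed chunking scheme)."""
    pool = []
    for path, text in files.items():
        lines = text.splitlines()
        for i in range(0, len(lines), window):
            chunk = "\n".join(lines[i:i + window])
            if chunk.strip():
                pool.append((path, chunk))
    return pool

def jaccard(a, b):
    """Token-overlap similarity; a stand-in for the paper's retriever."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def build_prompt(files, in_file_prefix, top_k=2):
    """Retrieve the top-k pool chunks most similar to the code before the
    cursor and prepend them as commented context to the in-file prefix."""
    pool = split_into_chunks(files)
    ranked = sorted(pool, key=lambda pc: jaccard(pc[1], in_file_prefix),
                    reverse=True)
    context = "\n\n".join(f"# from {path}\n{chunk}"
                          for path, chunk in ranked[:top_k])
    return context + "\n\n" + in_file_prefix
```

With this sketch, a cursor prefix such as `x = add(a, b)` would rank a chunk defining `add` above unrelated chunks, so the assembled prompt carries the cross-file definition the model needs to complete the call.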
Problem

Research questions and friction points this paper is trying to address.

Enhancing repository-level code completion using extensive context
Addressing limited scenarios in existing code completion benchmarks
Improving real-world code completion abilities of large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-based prompt construction for context
Benchmark with context perturbation strategy
Enhancing repository-level code completion abilities
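The context perturbation strategy named above is not detailed on this page; one plausible reading is that retrieved context is deliberately degraded (dropped or reordered) so the benchmark reflects the imperfect retrieval seen in real repositories. A hedged sketch of that idea, with assumed parameters:

```python
import random

def perturb_context(chunks, drop_prob=0.3, shuffle=True, seed=None):
    """Simulate imperfect real-world retrieval by randomly dropping and
    reordering retrieved context chunks (illustrative, not the paper's
    actual perturbation strategy)."""
    rng = random.Random(seed)
    kept = [c for c in chunks if rng.random() >= drop_prob]
    if not kept and chunks:
        kept = [rng.choice(chunks)]  # keep at least one chunk
    if shuffle:
        rng.shuffle(kept)
    return kept
```

Training and evaluating on such perturbed prompts would reward models that stay robust when the retrieved context is noisy or incomplete.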
👥 Authors
Ken Deng (Kwaipilot Team, Kuaishou Technology)
Jiaheng Liu (Alibaba Group)
He Zhu (Alibaba Group)
Congnan Liu (Alibaba Group)
Jingxin Li (Alibaba Group)
Jiakai Wang (Zhongguancun Laboratory)
Peng Zhao (Alibaba Group)
Chenchen Zhang (Alibaba Group)
Yanan Wu (Alibaba Group)
Xueqiao Yin (Alibaba Group)
Yuanxing Zhang (Kuaishou Technology)
Wenbo Su (Alibaba Group)
Bangyu Xiang (Alibaba Group)
Tiezheng Ge (Alimama, Alibaba Group)
Bo Zheng (Alibaba Group)