Structure-Aware Corpus Construction and User-Perception-Aligned Metrics for Large-Language-Model Code Completion

📅 2025-05-19

🤖 AI Summary

This work addresses two key bottlenecks in large language model (LLM)-based code completion: (1) the misalignment between conventional evaluation metrics and developers’ actual preferences, and (2) the lack of structural semantic modeling and cross-module dependency capture in repository-level completion. To tackle these, we propose: (1) user-aware, preference-consistent metrics—LCP and ROUGE-LCP—that significantly improve correlation with human judgments; and (2) SPSR-Graph, a structure-preserving and semantic-reordering data construction method that explicitly models code graph structures and inter-file dependencies. Experiments demonstrate that our metrics achieve substantially higher agreement with developer preferences than BLEU and CodeBLEU; SPSR-Graph improves repository-level completion accuracy by 12.7% and cross-file completion F1-score by 9.3%. This is the first work to quantitatively incorporate user perception into code completion evaluation and to jointly model structured semantics and inter-module dependencies at scale in real-world codebases.

📝 Abstract

Code completion technology based on large language models (LLMs) has significantly improved programmers' development efficiency. However, in practical applications, a gap remains between commonly used code completion evaluation metrics and users' actual perception. To address this issue, we propose two evaluation metrics for code completion tasks, LCP and ROUGE-LCP, from the perspective of probabilistic modeling. Furthermore, to tackle the lack of effective structural semantic modeling and cross-module dependency information in LLMs for repository-level code completion, we propose a data processing method based on a Structure-Preserving and Semantically-Reordered Code Graph (SPSR-Graph). Through theoretical analysis and experimental validation, we demonstrate that the proposed metrics align more closely with user perception and that the data processing method improves model performance.
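
The abstract names LCP without defining it. As a rough illustration of why a prefix-oriented score might track what a developer perceives (a completion is only useful up to its first wrong character), the sketch below assumes LCP refers to the longest common prefix between a predicted completion and the reference. The character-level comparison, the normalization by reference length, and the function names `lcp_length` and `lcp_ratio` are all illustrative assumptions, not the paper's definitions. ROUGE-LCP is not sketched because the abstract gives no detail on it.

```python
def lcp_length(prediction: str, reference: str) -> int:
    """Count the leading characters shared by prediction and reference."""
    count = 0
    for pred_char, ref_char in zip(prediction, reference):
        if pred_char != ref_char:
            break
        count += 1
    return count


def lcp_ratio(prediction: str, reference: str) -> float:
    """Shared-prefix length divided by reference length.

    The normalization is a hypothetical choice made for this sketch.
    """
    if not reference:
        # An empty reference is matched only by an empty prediction.
        return 1.0 if not prediction else 0.0
    return lcp_length(prediction, reference) / len(reference)


if __name__ == "__main__":
    reference = "return x + y"
    # Differs only in the last character: most of the completion is usable.
    print(lcp_ratio("return x + 1", reference))
    # Differs in the first character: nothing is usable as typed.
    print(lcp_ratio("Return x + y", reference))
```

Under these assumptions, the two predictions in the usage example each differ from the reference by a single character, yet they receive very different prefix scores. A token-overlap metric would not make this distinction.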

Problem

Research questions and friction points this paper is trying to address.

Bridging the gap between code completion metrics and user perception

Improving structural semantic modeling for repository-level completion

Enhancing LLM performance via structure-preserving data processing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposed LCP and ROUGE-LCP evaluation metrics

Introduced the Structure-Preserving and Semantically-Reordered Code Graph (SPSR-Graph)

Improved metric consistency with user perception and model performance

🔎 Similar Papers

R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models