Enhancing Repository-Level Code Generation with Call Chain-Aware Multi-View Context

📅 2025-07-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing repository-level code generation methods struggle to precisely identify relevant contextual information, and their prompt construction neglects structural code relationships, which limits large language models’ semantic understanding. To address this, we propose RepoScope, a framework that uses static analysis to construct a call-chain-aware structural semantic graph of the repository and fuses context from multiple views. It combines a four-view context retrieval scheme with a call chain prediction mechanism, and serializes the retrieved context with a structure-preserving strategy. RepoScope requires no additional training and only a single LLM query, while improving both the completeness and the accuracy of the retrieved context. Evaluated on the CoderEval and DevEval benchmarks, it achieves up to a 36.35% relative improvement in pass@1 over state-of-the-art methods, demonstrating its effectiveness and generalizability across diverse coding tasks.

📝 Abstract
Repository-level code generation aims to generate code within the context of a specified repository. Existing approaches typically employ retrieval-augmented generation (RAG) techniques to provide LLMs with relevant contextual information extracted from the repository. However, these approaches often struggle to identify the truly relevant contexts that capture the rich semantics of the repository, and their contextual perspectives remain narrow. Moreover, most approaches fail to account for the structural relationships in the retrieved code during prompt construction, hindering the LLM's ability to accurately interpret the context. To address these issues, we propose RepoScope, which leverages call chain-aware multi-view context for repository-level code generation. RepoScope constructs a Repository Structural Semantic Graph (RSSG) and retrieves a comprehensive four-view context, integrating both structural and similarity-based contexts. We propose a novel call chain prediction method that utilizes the repository's structural semantics to improve the identification of callees in the target function. Additionally, we present a structure-preserving serialization algorithm for prompt construction, ensuring the coherence of the context for the LLM. Notably, RepoScope relies solely on static analysis, eliminating the need for additional training or multiple LLM queries, thus ensuring both efficiency and generalizability. Evaluation on widely used repository-level code generation benchmarks (CoderEval and DevEval) demonstrates that RepoScope outperforms state-of-the-art methods, achieving up to a 36.35% relative improvement in pass@1 scores. Further experiments highlight RepoScope's potential to improve code generation across different tasks and its ability to integrate effectively with existing approaches.
Problem

Research questions and friction points this paper is trying to address.

Improving repository-level code generation with multi-view context
Enhancing context relevance using call chain-aware structural semantics
Optimizing prompt construction for better LLM interpretation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses call chain-aware multi-view context
Constructs Repository Structural Semantic Graph
Employs structure-preserving serialization algorithm
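
The ideas listed above can be illustrated with a small sketch. This is not RepoScope's implementation: it is a toy example, assuming Python source files held in memory, that uses the standard `ast` module to extract caller-to-callee edges by static analysis and then emits the selected callees grouped under their file in source order. The function names (`build_call_graph`, `resolve_callees`, `serialize_by_file`), the `path::name` key format, and the sample `SOURCES` are all hypothetical; the RSSG described in the paper is richer than this name-based call graph.

```python
import ast
from collections import defaultdict

# Hypothetical two-file repository used only for illustration.
SOURCES = {
    "utils.py": (
        "def load(path):\n    return open(path).read()\n\n"
        "def parse(text):\n    return text.split()\n"
    ),
    "main.py": "def run(path):\n    return parse(load(path))\n",
}


def build_call_graph(sources):
    """Map 'file::function' to the bare names it calls (static analysis only)."""
    graph = defaultdict(set)
    for path, code in sources.items():
        for node in ast.walk(ast.parse(code)):
            if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            caller = f"{path}::{node.name}"
            graph[caller]  # register the function even if it calls nothing
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call):
                    func = sub.func
                    if isinstance(func, ast.Name):
                        graph[caller].add(func.id)      # e.g. parse(...)
                    elif isinstance(func, ast.Attribute):
                        graph[caller].add(func.attr)    # e.g. obj.read()
    return dict(graph)


def resolve_callees(graph, caller):
    """Resolve called names to functions defined in the repository.

    Assumption: matching is by bare name only, so same-named functions
    in different files are all returned.
    """
    by_name = defaultdict(set)
    for qualified in graph:
        by_name[qualified.split("::")[1]].add(qualified)
    return {q for name in graph.get(caller, ()) for q in by_name.get(name, ())}


def serialize_by_file(sources, selected):
    """Emit selected top-level functions grouped by file, in source order."""
    parts = []
    for path in sorted(sources):
        code = sources[path]
        chunks = [
            ast.get_source_segment(code, node)
            for node in ast.parse(code).body
            if isinstance(node, ast.FunctionDef)
            and f"{path}::{node.name}" in selected
        ]
        if chunks:
            parts.append(f"# file: {path}\n" + "\n\n".join(chunks))
    return "\n\n".join(parts)


graph = build_call_graph(SOURCES)
callees = resolve_callees(graph, "main.py::run")
prompt_context = serialize_by_file(SOURCES, callees)
print(prompt_context)
```

Grouping the retrieved functions under a file header, rather than concatenating them in retrieval order, is one simple way to keep structural relationships visible to the model; the paper's serialization algorithm may differ in its details.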
Yang Liu
State Key Laboratory of Complex & Critical Software Environment, School of Computer Science and Engineering, Beihang University, Beijing, China
Li Zhang
State Key Laboratory of Complex & Critical Software Environment, School of Computer Science and Engineering, Beihang University, Beijing, China
Fang Liu
State Key Laboratory of Complex & Critical Software Environment, School of Computer Science and Engineering, Beihang University, Beijing, China
Zhuohang Wang
State Key Laboratory of Complex & Critical Software Environment, School of Computer Science and Engineering, Beihang University, Beijing, China
Donglin Wei
State Key Laboratory of Complex & Critical Software Environment, School of Computer Science and Engineering, Beihang University, Beijing, China
Zhishuo Yang
State Key Laboratory of Complex & Critical Software Environment, School of Computer Science and Engineering, Beihang University, Beijing, China
Kechi Zhang
Peking University
AI4SE
Jia Li
School of Computer Science, Peking University, Beijing, China
Lin Shi
Beihang University
Software Engineering