CITYWALK: Enhancing LLM-Based C++ Unit Test Generation via Project-Dependency Awareness and Language-Specific Knowledge

📅 2025-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Automated unit test generation for C++ remains challenging because of language-specific features (pointers, templates, and virtual functions) and strong cross-file dependencies. Method: The paper proposes a retrieval-augmented generation framework, described as the first of its kind, that combines a fine-grained project-level dependency graph with C++-specific knowledge drawn from API documentation and expert insights. It builds a cross-file dependency model via static analysis and injects the distilled knowledge into the prompts sent to GPT-4o, addressing LLMs' limited support for compiled languages. Results: Evaluated on eight mainstream C++ projects, the approach achieves a 32.7% improvement in test pass rate and 68.4% line coverage, outperforming state-of-the-art methods. Core contribution: A C++-aware, dependency-driven prompting paradigm that targets test executability and coverage together.

📝 Abstract
Unit testing plays a pivotal role in the software development lifecycle, as it ensures code quality. However, writing high-quality unit tests remains a time-consuming task for developers in practice. More recently, the application of large language models (LLMs) in automated unit test generation has demonstrated promising results. Existing approaches primarily focus on interpreted programming languages (e.g., Java), while mature solutions tailored to compiled programming languages like C++ are yet to be explored. The intricate language features of C++, such as pointers, templates, and virtual functions, pose particular challenges for LLMs in generating both executable and high-coverage unit tests. To tackle the aforementioned problems, this paper introduces CITYWALK, a novel LLM-based framework for C++ unit test generation. CITYWALK enhances LLMs by providing a comprehensive understanding of the dependency relationships within the project under test via program analysis. Furthermore, CITYWALK incorporates language-specific knowledge about C++ derived from project documentation and empirical observations, significantly improving the correctness of the LLM-generated unit tests. We implement CITYWALK by employing the widely popular LLM GPT-4o. The experimental results show that CITYWALK outperforms current state-of-the-art approaches on a collection of eight popular C++ projects. Our findings demonstrate the effectiveness of CITYWALK in generating high-quality C++ unit tests.
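To make the idea of dependency-aware prompting concrete, the following is a minimal sketch and not CITYWALK's actual implementation. It assumes a hypothetical `DependencyGraph` that maps each file to the files it directly depends on, collects the transitive dependencies of a focal file, and places them in a prompt alongside a list of C++-specific hints. The names `DependencyGraph`, `transitiveDependencies`, and `buildPrompt`, and the prompt layout, are all illustrative assumptions.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each file maps to the files it directly depends on
// (for example through #include). Illustration only, not the paper's data structure.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Collect every file reachable from `focal` (excluding `focal` itself)
// using a breadth-first search; the visited set also guards against cycles.
std::set<std::string> transitiveDependencies(const DependencyGraph& graph,
                                             const std::string& focal) {
    std::set<std::string> visited;
    std::queue<std::string> pending;
    pending.push(focal);
    while (!pending.empty()) {
        std::string current = pending.front();
        pending.pop();
        auto it = graph.find(current);
        if (it == graph.end()) continue;  // no recorded dependencies for this file
        for (const std::string& dep : it->second) {
            if (dep != focal && visited.insert(dep).second) {
                pending.push(dep);
            }
        }
    }
    return visited;
}

// Assemble a prompt that lists the focal file, its transitive dependencies,
// and caller-supplied C++ hints. The wording of the prompt is an assumption.
std::string buildPrompt(const DependencyGraph& graph,
                        const std::string& focal,
                        const std::vector<std::string>& languageHints) {
    std::string prompt = "Generate a C++ unit test for: " + focal + "\n";
    prompt += "Project dependencies:\n";
    for (const std::string& dep : transitiveDependencies(graph, focal)) {
        prompt += "- " + dep + "\n";
    }
    prompt += "C++ notes:\n";
    for (const std::string& hint : languageHints) {
        prompt += "- " + hint + "\n";
    }
    return prompt;
}
```

In a real pipeline the graph would come from static analysis of the project under test, and the listed dependencies would likely carry declarations or signatures rather than file names alone; the sketch only shows how dependency context and language hints can be combined before the prompt reaches the model.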
Problem

Research questions and friction points this paper is trying to address.

C++
Unit Testing
Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

CITYWALK
GPT-4o
C++ Unit Testing
Yuwei Zhang
Key Laboratory of System Software (Chinese Academy of Sciences), Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China
Qingyuan Lu
Key Laboratory of System Software (Chinese Academy of Sciences), Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China
Kai Liu
Shanghai Stock Exchange Technology Co., Ltd., China
Wensheng Dou
Professor, Institute of Software, Chinese Academy of Sciences (ISCAS)
software analysis and testing, database systems, distributed systems, spreadsheet
Jiaxin Zhu
Institute of Software, Chinese Academy of Sciences
software engineering, mining software repositories
Li Qian
University of Michigan
Database Usability
Chunxi Zhang
Shanghai Stock Exchange Technology Co., Ltd., China
Zheng Lin
Shanghai Stock Exchange Technology Co., Ltd., China
Jun Wei
Key Laboratory of System Software (Chinese Academy of Sciences), Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing; Nanjing Institute of Software Technology; University of Chinese Academy of Sciences, Nanjing, China