CXXCrafter: An LLM-Based Agent for Automated C/C++ Open Source Software Building

📅 2025-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Build automation for C/C++ open-source projects faces core challenges including intricate dependency graphs, heterogeneous build systems (e.g., Make, CMake, Autotools, Bazel), diverse toolchains, and poor error resilience, areas where existing approaches offer limited support. To address these, we propose CXXCrafter, an LLM-based build agent that interacts dynamically with the build environment. It performs end-to-end adaptive build repair by combining the tacit knowledge stored in the LLM with closed-loop feedback from the environment. The agent relies on LLM comprehension to handle multiple build systems in a unified way, and it diagnoses errors in context, retries, and recovers based on the current build state. Evaluated on open-source software, CXXCrafter achieves a 78% build success rate. On the Top100 dataset, 72 projects are built by both CXXCrafter and manual effort, 3 by CXXCrafter only, and 14 manually only. This work points toward a scalable, LLM-driven approach to build automation in the C/C++ ecosystem.
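The closed-loop repair idea in the summary can be illustrated with a minimal sketch. This is not CXXCrafter's implementation: `build_with_repair`, `BuildResult`, and the three callbacks are hypothetical names, and `propose_fix` stands in for whatever the LLM would suggest after reading a build log. The toy `simulate` function fakes a project with two missing dependencies so the loop can be run without any real toolchain.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class BuildResult:
    succeeded: bool
    attempts: int
    applied_fixes: List[str] = field(default_factory=list)


def build_with_repair(
    run_build: Callable[[], Tuple[bool, str]],
    propose_fix: Callable[[str], Optional[str]],
    apply_fix: Callable[[str], None],
    max_attempts: int = 5,
) -> BuildResult:
    """Run the build, feed each failure log back to a fixer, and retry."""
    applied: List[str] = []
    for attempt in range(1, max_attempts + 1):
        ok, log = run_build()
        if ok:
            return BuildResult(True, attempt, applied)
        fix = propose_fix(log)  # hypothetical stand-in for an LLM call
        if fix is None:
            # The fixer has no suggestion: stop instead of looping blindly.
            return BuildResult(False, attempt, applied)
        apply_fix(fix)
        applied.append(fix)
    return BuildResult(False, max_attempts, applied)


def simulate() -> BuildResult:
    """Toy environment: the build fails until both fake dependencies exist."""
    required = ["zlib", "openssl"]  # illustrative names only
    installed = set()
    prefix = "error: missing dependency "

    def run_build() -> Tuple[bool, str]:
        for dep in required:
            if dep not in installed:
                return False, prefix + dep
        return True, "build ok"

    def propose_fix(log: str) -> Optional[str]:
        return log[len(prefix):] if log.startswith(prefix) else None

    return build_with_repair(run_build, propose_fix, installed.add)
```

In the toy run, each failed attempt surfaces one missing dependency, the fixer installs it, and the next attempt gets further. The attempt cap and the early exit when no fix is proposed are design choices of this sketch, not details taken from the paper.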

📝 Abstract

Project building is pivotal to support various program analysis tasks, such as generating intermediate representation code for static analysis and preparing binary code for vulnerability reproduction. However, automating the building process for C/C++ projects is a highly complex endeavor, involving tremendous technical challenges, such as intricate dependency management, diverse build systems, varied toolchains, and multifaceted error handling mechanisms. Consequently, building C/C++ projects often proves to be difficult in practice, hindering the progress of downstream applications. Unfortunately, research on facilitating the building of C/C++ projects remains inadequate. The emergence of Large Language Models (LLMs) offers promising solutions to automated software building. Trained on extensive corpora, LLMs can help unify diverse build systems through their comprehension capabilities and address complex errors by leveraging tacit knowledge storage. Moreover, LLM-based agents can be systematically designed to dynamically interact with the environment, effectively managing dynamic building issues. Motivated by these opportunities, we first conduct an empirical study to systematically analyze the current challenges in the C/C++ project building process. In particular, we observe that most popular C/C++ projects encounter an average of five errors when relying solely on the default build systems. Based on our study, we develop an automated build system called CXXCrafter to specifically address the above-mentioned challenges, such as dependency resolution. Our evaluation on open-source software demonstrates that CXXCrafter achieves a success rate of 78% in project building. Specifically, in the Top100 dataset, 72 projects are built successfully by both CXXCrafter and manual efforts, 3 by CXXCrafter only, and 14 manually only. ...
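The Top100 figures quoted in the abstract can be unpacked with a little arithmetic. The sketch below assumes the Top100 dataset contains exactly 100 projects, which the name suggests but the abstract does not state; `top100_breakdown` is a hypothetical helper name. The 78% figure is reported for the overall evaluation and need not equal the Top100 share.

```python
def top100_breakdown(total: int = 100) -> dict:
    """Derive per-method totals from the three counts given in the abstract.

    `total` is an assumption (Top100 taken to mean 100 projects).
    """
    both = 72         # built by CXXCrafter and by manual effort
    cxx_only = 3      # built by CXXCrafter only
    manual_only = 14  # built manually only

    built_by_either = both + cxx_only + manual_only
    return {
        "cxxcrafter_total": both + cxx_only,
        "manual_total": both + manual_only,
        "built_by_either": built_by_either,
        "built_by_neither": total - built_by_either,
    }


if __name__ == "__main__":
    for name, count in top100_breakdown().items():
        print(f"{name}: {count}")
```

Under that assumption, CXXCrafter builds 75 projects and manual effort builds 86, so manual building still covers more of Top100, while the 3 CXXCrafter-only projects raise the combined coverage to 89.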

Problem

Research questions and friction points this paper is trying to address.

Automating C/C++ project building is complex due to dependencies and errors

Existing methods struggle with diverse build systems and toolchains

LLM-based agents can dynamically resolve build issues and dependencies

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based agent for automated C/C++ building

Unifies diverse build systems via LLM comprehension

Dynamically interacts to manage build issues
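To make the "diverse build systems" point concrete, here is a rough sketch of how a tool might guess which build systems a repository uses from well-known marker files. It is a plain heuristic for illustration, not the LLM-based approach the paper describes; `detect_build_systems` and the `MARKERS` table are hypothetical, and real projects may use other files or nest them in subdirectories.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Conventional top-level marker files for each build system.
# The table is illustrative and deliberately incomplete.
MARKERS: Dict[str, Tuple[str, ...]] = {
    "CMake": ("CMakeLists.txt",),
    "Autotools": ("configure.ac", "configure.in", "Makefile.am"),
    "Bazel": ("WORKSPACE", "WORKSPACE.bazel", "MODULE.bazel"),
    "Make": ("Makefile", "GNUmakefile", "makefile"),
}


def detect_build_systems(project_root: str) -> List[str]:
    """Return the build systems whose marker files exist in project_root."""
    root = Path(project_root)
    return [
        system
        for system, files in MARKERS.items()
        if any((root / name).is_file() for name in files)
    ]
```

A repository can match more than one entry, for example a CMake project that also ships a convenience Makefile, which is one reason a fixed lookup like this falls short and the choice of entry point benefits from reading the project's documentation and build logs.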

🔎 Similar Papers

From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future