🤖 AI Summary
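Problem: In industrial embedded systems, co-developing hardware and software frequently causes compilation errors during continuous integration (CI). Existing automated repair techniques rely on test cases, which are not available for code that does not compile.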
Method: This paper integrates an LLM-based compilation error repair method, which operates without test cases, directly into a production CI pipeline. It evaluates four state-of-the-art large language models on compilation errors drawn from more than 40,000 real-world commits to an industrial product, covering end-to-end error localization and patch generation, and compares the generated fixes with manual corrections made by human programmers.
Contribution/Results: On the baseline dataset, the LLM-equipped CI system resolves up to 63% of compilation errors, and 83% of the fixes that lead to a successful CI build are judged reasonable. Most successful repairs complete within 8 minutes, whereas manual debugging typically takes hours. To the best of our knowledge, this work represents the first industrial-grade application and empirical study of LLMs for test-case-free compilation error repair, bridging a gap between research and practice.
📝 Abstract
The co-development of hardware and software in industrial embedded systems frequently leads to compilation errors during continuous integration (CI). Automated repair of such failures is promising, but existing techniques rely on test cases, which are not available for non-compilable code.
We employ an LLM-driven approach to automatically repair compilation errors. Our study encompasses the collection of more than 40,000 commits from the product's source code. We assess the performance of an industrial CI system enhanced by four state-of-the-art large language models (LLMs), comparing their outcomes with manual corrections provided by human programmers. LLM-equipped CI systems can resolve up to 63% of the compilation errors in our baseline dataset. Among the fixes associated with successful CI builds, 83% are deemed reasonable. Moreover, LLMs significantly reduce debugging time, with the majority of successful cases completed within 8 minutes, compared to hours typically required for manual debugging.
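To make the idea of test-case-free repair concrete, the following is a minimal sketch of a repair loop in which the only success signal is that the build compiles. It is an illustration under stated assumptions, not the pipeline used in the paper: the names `BuildResult`, `repair_compilation_error`, `compile_fn`, `propose_fix`, and `max_attempts` are all hypothetical, and the compiler and the model are passed in as plain callables so that no particular CI system or LLM API is implied.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BuildResult:
    ok: bool   # True when the build finished without compilation errors
    log: str   # compiler diagnostics, passed to the model as context


def repair_compilation_error(
    source: str,
    compile_fn: Callable[[str], BuildResult],   # hypothetical: wraps the CI build step
    propose_fix: Callable[[str, str], str],     # hypothetical: wraps an LLM call
    max_attempts: int = 3,                      # assumed retry budget, not from the paper
) -> Optional[str]:
    """Return a version of `source` that compiles, or None if every attempt fails."""
    result = compile_fn(source)
    if result.ok:
        return source  # nothing to repair

    candidate = source
    for _ in range(max_attempts):
        # Ask the model for a patched version, given the code and the diagnostics.
        candidate = propose_fix(candidate, result.log)
        # Without test cases, a successful build is the only automatic check.
        result = compile_fn(candidate)
        if result.ok:
            return candidate
    return None
```

A successful build does not by itself show that a fix is correct, which is consistent with the abstract reporting separately how many of the fixes with successful CI builds were deemed reasonable. In practice a returned candidate would still go through human review.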