🤖 AI Summary
This work proposes PhantomRun, a framework that uses large language models (LLMs) to automatically repair compilation failures in continuous integration (CI) pipelines for open-source embedded software. In the studied projects, these failures stem mainly from hardware dependencies, followed by syntax errors and build-script issues, and they consume significant developer time for debugging. PhantomRun combines build logs, source code, historical fixes, and compiler error messages to generate and validate fixes. An adaptation layer supports GitHub Actions and GitLab CI as well as four build systems. An evaluation on four major open-source embedded projects shows that PhantomRun repairs up to 45% of CI compilation failures, demonstrating the viability of LLM-based repair in this setting.
📝 Abstract
Continuous Integration (CI) pipelines for embedded software sometimes fail during compilation, consuming significant developer time for debugging. We study four major open-source embedded system projects, covering more than 4,000 build failures from the projects' CI runs. We find that hardware dependencies account for the majority of compilation failures, followed by syntax errors and build-script issues. Most repairs need relatively small changes, making automated repair potentially suitable as long as the diverse setups and lack of test data can be handled. In this paper, we present PhantomRun, an automated framework that leverages large language models (LLMs) to generate and validate fixes for CI compilation failures. The framework addresses the challenge of diverse build infrastructures and toolchains across embedded system projects by providing an adaptation layer for GitHub Actions and GitLab CI and four different build systems. PhantomRun uses build logs, source code, historical fixes, and compiler error messages to synthesize fixes using LLMs. Our evaluations show that PhantomRun successfully repairs up to 45% of CI compilation failures across the targeted projects, demonstrating the viability of LLM-based repairs for embedded-system CI pipelines.
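To make the inputs named above concrete, the following is a minimal sketch of how compiler error messages might be pulled out of a build log and combined with source code and historical fixes into a single repair context. It is not PhantomRun's implementation: the names `CompilerError`, `extract_errors`, and `build_repair_context` are hypothetical, and the parser assumes GCC/Clang-style diagnostics of the form `path:line:col: error: message`.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumption: diagnostics follow the GCC/Clang layout "path:line:col: error: message".
DIAGNOSTIC = re.compile(
    r"^(?P<path>[^\s:]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?:fatal error|error): (?P<msg>.+)$"
)


@dataclass
class CompilerError:
    """One compiler error located in a source file (hypothetical type)."""
    path: str
    line: int
    col: int
    message: str


def extract_errors(build_log: str) -> list[CompilerError]:
    """Collect error diagnostics from a build log, skipping warnings and other noise."""
    errors = []
    for raw in build_log.splitlines():
        match = DIAGNOSTIC.match(raw.strip())
        if match:
            errors.append(
                CompilerError(
                    path=match["path"],
                    line=int(match["line"]),
                    col=int(match["col"]),
                    message=match["msg"],
                )
            )
    return errors


def snippet(source: str, line: int, radius: int = 2) -> str:
    """Return the numbered source lines within `radius` lines of `line` (1-based)."""
    lines = source.splitlines()
    start = max(0, line - 1 - radius)
    end = min(len(lines), line + radius)
    return "\n".join(f"{i + 1}: {lines[i]}" for i in range(start, end))


def build_repair_context(
    build_log: str, sources: dict[str, str], past_fixes: list[str]
) -> str:
    """Assemble errors, nearby source, and historical fixes into one text block."""
    parts = []
    for err in extract_errors(build_log):
        parts.append(f"ERROR {err.path}:{err.line}:{err.col}: {err.message}")
        if err.path in sources:
            parts.append(snippet(sources[err.path], err.line))
    if past_fixes:
        parts.append("HISTORICAL FIXES:")
        parts.extend(past_fixes)
    return "\n".join(parts)
```

In a full pipeline, the returned text would be passed to an LLM, and the proposed patch would be validated by re-running the CI build. Both steps are omitted here because the abstract does not specify how they are carried out.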