🤖 AI Summary
To address the challenge of detecting atomicity violations caused by asynchronous interrupts in interrupt-driven critical systems, this paper proposes a novel framework integrating static analysis with a multi-agent large language model (LLM) architecture. A domain-expert agent identifies potential violation patterns, while a judge agent verifies them by combining program slicing with operational dependency modeling, augmented with domain-specific knowledge to improve precision. The approach balances scalability across large program state spaces with semantic fidelity to domain constraints. Evaluated on the RaceBench 2.1, SV-COMP, and RWIP benchmarks, it achieves 92.3% precision and 86.6% recall, improving F1-score by 27.4–118.2% over state-of-the-art methods. This work represents the first deep integration of LLM agents and static analysis for atomicity verification in interrupt-driven systems.
📝 Abstract
Atomicity violations in interrupt-driven programs pose a significant threat to software safety in critical systems. These violations occur when the execution sequence of operations on shared resources is disrupted by asynchronous interrupts. Detecting atomicity violations is challenging due to the vast program state space, application-level code dependencies, and complex domain-specific knowledge. We propose Clover, a hybrid framework that integrates static analysis with large language model (LLM) agents to detect atomicity violations in real-world programs. Clover first performs static analysis to extract critical code snippets and operation information. It then initiates a multi-agent process in which an expert agent leverages domain-specific knowledge to detect atomicity violations, which are subsequently validated by a judge agent. Evaluations on RaceBench 2.1, SV-COMP, and RWIP demonstrate that Clover achieves a precision/recall of 92.3%/86.6%, outperforming existing approaches by 27.4–118.2% in F1-score.