🤖 AI Summary
Code generated by large language models (LLMs) frequently exhibits functional defects, limiting its applicability in safety-critical domains requiring high reliability.
Method: We propose an LLM-driven iterative verification-guided synthesis framework that jointly leverages natural language specifications, ACSL formal annotations, and test cases to generate C programs amenable to formal verification. The framework integrates Frama-C-based static verification, test-driven candidate filtering, and multi-round LLM-based repair, establishing a closed-loop synergy between code generation and formal correctness validation.
Contribution/Results: Unlike conventional LLM-only generation paradigms, our approach checks every candidate program against a formal ACSL specification, so an accepted program is proven to satisfy that specification rather than merely passing tests. Evaluated on 15 Codeforces problems, it produced formally verified C programs for 13 of them, demonstrating both effectiveness and practical viability for generating formally verifiable code.
📝 Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities in generating code, yet they often produce programs with flaws or deviations from intended behavior, limiting their suitability for safety-critical applications. To address this limitation, this paper introduces VeCoGen, a novel tool that combines LLMs with formal verification to automate the generation of formally verified C programs. VeCoGen takes three inputs: a formal specification in the ANSI/ISO C Specification Language (ACSL), a natural language specification, and a set of test cases. From these it attempts to generate a program in two steps. First, VeCoGen generates an initial set of candidate programs. Second, it iteratively improves on previously generated candidates. If a candidate program is verified against the formal specification, it is guaranteed to be correct with respect to that specification. We evaluate VeCoGen on 15 problems from Codeforces competitions, of which it solves 13. This work shows the potential of combining LLMs with formal verification to automate program generation.
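To make the role of the formal specification concrete, the following is a minimal sketch of an ACSL-annotated C function. It is an illustrative example, not one of the paper's benchmark problems, and the function name `max_int` is hypothetical. The contract in the `/*@ ... */` comment plays the part of the formal specification supplied as input, and the function body stands in for a candidate program that a deductive verifier (for example Frama-C's WP plugin, typically run as `frama-c -wp file.c`) would try to prove against that contract.

```c
/* Hypothetical example: an ACSL contract plus a candidate implementation. */

/*@
  @ assigns \nothing;                       // the function has no side effects
  @ ensures \result >= a && \result >= b;   // result is an upper bound of both inputs
  @ ensures \result == a || \result == b;   // result is one of the inputs
  @*/
int max_int(int a, int b)
{
    return (a >= b) ? a : b;
}
```

A candidate that always returned `a` would compile and could pass a weak test suite, but the first `ensures` clause would fail to verify whenever `b > a`. That kind of failure is the feedback an iterative repair step can act on.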