AI Summary
This work addresses the lack of correctness guarantees in existing text-to-code generation methods, which still rely heavily on manual inspection and thus hinder development efficiency. To bridge this gap, the paper introduces verifiable formal assertions into the code generation process for the first time: a large language model simultaneously produces C code and candidate assertions, which are then verified compositionally, on a best-effort basis, by a portfolio of bounded model checkers. This approach provides partial correctness guarantees and enhances code interpretability. Experiments across 18 programming tasks suggest that the method efficiently generates code accompanied by verified assertions. Furthermore, a user study involving over 400 participants indicates that these assertions improve developers' performance on code-comprehension tasks.
Abstract
A fundamental limitation of Text-to-Code is that no guarantee can be obtained about the correctness of the generated code. Therefore, to ensure its correctness, the generated code still has to be reviewed, tested, and maintained by developers. However, reading through LLM-generated code can be tedious and time-consuming, potentially negating the productivity gains promised by AI coding tools. To address this challenge, we present Viverra, a system that automatically produces formally verified annotations alongside generated code to aid users' understanding of the generated program. Given a natural-language task description, Viverra prompts an LLM to synthesize a C program together with candidate assertions expressing safety and correctness properties. It then verifies those assertions in a compositional and best-effort manner via a portfolio of bounded model checkers. Evaluation on 18 diverse programming tasks suggests that Viverra can efficiently generate code with verified assertions, and that these assertions improve users' performance on code-comprehension tasks in a user study with more than 400 participants.
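To make the idea of a C program with candidate assertions concrete, the following is a minimal sketch of what such annotated output might look like. It is not taken from the paper: the task ("return the index of the largest element of a non-empty array"), the function name `argmax`, and the particular assertions are all hypothetical, chosen only to show one safety property and one correctness property written as standard `assert` calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task (an assumption, not from the paper):
 * return the index of the largest element of a non-empty array. */
size_t argmax(const int *a, size_t n) {
    /* Candidate precondition: the input is a non-empty, valid array. */
    assert(a != NULL && n > 0);

    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        /* Candidate loop assertion: best always refers to an
         * element that has already been visited. */
        assert(best < i);
        if (a[i] > a[best]) {
            best = i;
        }
    }

    /* Candidate safety property: the result is a valid index. */
    assert(best < n);

    /* Candidate correctness property: no element exceeds a[best]. */
    for (size_t j = 0; j < n; j++) {
        assert(a[best] >= a[j]);
    }
    return best;
}
```

Because the properties are plain `assert` calls, a bounded model checker that understands C assertions could attempt to prove each one for all inputs up to a chosen bound, and a reader could use whichever assertions survive as a checked description of what the function does. How Viverra actually phrases, splits, and dispatches its assertions is not specified in the abstract, so the structure above is illustrative only.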