AI Summary
Current large language models treat proof generation in formal verification as a static end-to-end prediction task, which precludes the use of program execution feedback and thereby limits their proving capabilities. This work proposes the first large language model framework that integrates counterexample-guided reasoning: upon verification failure, the system automatically generates and validates concrete counterexamples, then uses them to guide the model in generalizing inductive invariants for proof repair. By introducing dynamic, behavior-aware counterexample reasoning into large language model-driven formal verification, this approach significantly improves the accuracy, robustness, and token efficiency of proof generation in Verus, outperforming state-of-the-art prompting strategies.
Abstract
Large Language Models (LLMs) have shown promising results in automating formal verification. However, existing approaches treat proof generation as a static, end-to-end prediction over source code, relying on limited verifier feedback and lacking access to concrete program behaviors. We present EXVERUS, a counterexample-guided framework that enables LLMs to reason about proofs using behavioral feedback in the form of counterexamples. When a proof fails, EXVERUS automatically generates and validates counterexamples, then guides the LLM to generalize them into inductive invariants that rule out the failing behaviors. Our evaluation shows that EXVERUS significantly improves proof accuracy, robustness, and token efficiency over the state-of-the-art prompting-based Verus proof generator.
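To make the counterexample-guided idea concrete, here is a minimal, self-contained toy sketch of the refinement loop the abstract describes: candidate invariants are checked against concrete execution states, and a refuting state (the counterexample) signals that the candidate must be generalized. All names (`trace_states`, `search_invariant`, the candidate list) are illustrative assumptions, not the actual EXVERUS API, which drives an LLM and the Verus verifier rather than a fixed candidate list.

```python
# Toy counterexample-guided invariant search (illustrative only; EXVERUS
# itself uses an LLM to generalize invariants and Verus to check proofs).

def trace_states(n):
    """Concrete states (i, s) of the loop: s = 0; for i in 0..=n: s += i."""
    states, s = [], 0
    for i in range(n + 1):
        s += i
        states.append((i, s))
    return states

# Candidate invariants over (i, s), from too specific to actually inductive.
CANDIDATES = [
    ("s == i",                lambda i, s: s == i),
    ("s == 2 * i",            lambda i, s: s == 2 * i),
    ("s == i * (i + 1) / 2",  lambda i, s: s == i * (i + 1) // 2),
]

def search_invariant(candidates, test_bounds=(3, 5, 8)):
    """Return the first candidate that no concrete execution state refutes."""
    for name, inv in candidates:
        cex = next(
            ((i, s)
             for n in test_bounds
             for (i, s) in trace_states(n)
             if not inv(i, s)),
            None,
        )
        if cex is None:
            return name  # no counterexample found: keep this invariant
        # `cex` is a concrete refuting state; in EXVERUS such validated
        # counterexamples are fed back to the LLM to guide generalization.
    return None
```

In this toy, `search_invariant(CANDIDATES)` rejects the first two candidates via concrete counterexamples and settles on the closed-form invariant; the real system replaces the fixed candidate list with LLM-proposed invariants and the concrete checks with Verus verification.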