A Neurosymbolic Approach to Loop Invariant Generation via Weakest Precondition Reasoning

📅 2025-12-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Automated loop invariant generation remains a critical bottleneck in program verification. This paper introduces NeuroInv, a neurosymbolic framework that integrates Hoare logic's weakest precondition (WP) reasoning into large language model (LLM) inference. The method pairs LLM-driven backward WP derivation with iterative invariant repair guided by counterexamples from OpenJML. On a benchmark of 150 Java programs, it achieves a 99.5% success rate, substantially outperforming the other evaluated approaches. On a harder suite of 10 larger multi-loop programs, averaging 7 loops each, its performance indicates that the approach scales to more complex verification scenarios.

📝 Abstract
Loop invariant generation remains a critical bottleneck in automated program verification. Recent work has begun to explore the use of Large Language Models (LLMs) in this area, yet these approaches tend to lack a reliable and structured methodology, with little reference to existing program verification theory. This paper presents NeuroInv, a neurosymbolic approach to loop invariant generation. NeuroInv comprises two key modules: (1) a neural reasoning module that leverages LLMs and Hoare logic to derive and refine candidate invariants via backward-chaining weakest precondition reasoning, and (2) a verification-guided symbolic module that iteratively repairs invariants using counterexamples from OpenJML. We evaluate NeuroInv on a comprehensive benchmark of 150 Java programs, encompassing single and multiple (sequential) loops, multiple arrays, random branching, and noisy code segments. NeuroInv achieves a $99.5\%$ success rate, substantially outperforming the other evaluated approaches. Additionally, we introduce a hard benchmark of $10$ larger multi-loop programs (with an average of $7$ loops each); NeuroInv's performance in this setting demonstrates that it can scale to more complex verification scenarios.
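
To make the target of the approach concrete, here is a minimal sketch of the kind of JML-annotated Java loop that a checker such as OpenJML verifies. The class `SumToN` and its method are hypothetical illustrations and are not taken from the paper's benchmark; the `loop_invariant` clauses are the annotations that an invariant generator would need to produce.

```java
// Hypothetical example (not from the paper): summing 1..n with JML annotations.
class SumToN {
    //@ requires 0 <= n && n <= 1000;          // bound keeps the sum within int range
    //@ ensures \result == n * (n + 1) / 2;
    static int sum(int n) {
        int i = 0;
        int s = 0;
        // The two invariants below are what an invariant generator has to find.
        //@ loop_invariant 0 <= i && i <= n;
        //@ loop_invariant s == i * (i + 1) / 2;
        //@ decreases n - i;
        while (i < n) {
            i = i + 1;
            s = s + i;
        }
        // On exit, i == n and the second invariant gives the postcondition.
        return s;
    }
}
```

Because JML annotations live in comments, the class compiles and runs as ordinary Java; the annotations only take effect when a JML tool checks the file. Dropping either invariant would typically leave the postcondition unprovable, and that kind of failure is what a counterexample-guided repair step reacts to.
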
Problem

Research questions and friction points this paper is trying to address.

Loop invariant generation remains a bottleneck in automated program verification
Existing LLM-based approaches lack a reliable, structured methodology grounded in verification theory
Verification must scale to multi-loop programs and noisy code
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neurosymbolic approach combines LLMs with Hoare logic
Backward-chaining weakest precondition reasoning refines invariants
Verification-guided symbolic repair uses counterexamples from OpenJML
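
As a worked illustration of backward weakest precondition reasoning in standard Hoare logic (not a reproduction of the paper's procedure), consider a hypothetical loop `while (i < n) { i = i + 1; s = s + i; }` with postcondition $s = n(n+1)/2$ and candidate invariant $I$. Substituting backward through the two assignments gives:

```latex
\begin{aligned}
I &\;\equiv\; 0 \le i \le n \;\land\; s = \tfrac{i(i+1)}{2} \\
wp(s := s + i,\; I) &\;\equiv\; 0 \le i \le n \;\land\; s + i = \tfrac{i(i+1)}{2} \\
wp(i := i + 1;\; s := s + i,\; I) &\;\equiv\; 0 \le i + 1 \le n \;\land\; s + (i + 1) = \tfrac{(i+1)(i+2)}{2} \\[4pt]
\text{Preservation:}\quad & I \land i < n \;\Rightarrow\; wp(i := i + 1;\; s := s + i,\; I) \\
\text{Exit:}\quad & I \land \lnot (i < n) \;\Rightarrow\; s = \tfrac{n(n+1)}{2}
\end{aligned}
```

Preservation holds because $s = i(i+1)/2$ implies $s + (i+1) = (i+1)(i+2)/2$, and $0 \le i < n$ implies $0 \le i + 1 \le n$. Exit holds because $i \le n$ together with $\lnot(i < n)$ forces $i = n$.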