A Neurosymbolic Approach to Loop Invariant Generation via Weakest Precondition Reasoning

📅 2025-12-17

🤖 AI Summary

Automated generation of loop invariants remains a critical bottleneck in program verification. This paper introduces NeuroInv, a neurosymbolic framework that integrates weakest precondition (WP) reasoning from Hoare logic into large language model (LLM) inference. An LLM derives candidate invariants by backward WP reasoning, and counterexamples from OpenJML drive iterative repair of candidates that fail verification. On a benchmark of 150 Java programs, NeuroInv achieves a 99.5% success rate, substantially outperforming the other evaluated approaches. On a harder benchmark of 10 larger multi-loop programs (averaging seven loops each), its performance indicates that the approach can scale to more complex verification scenarios.

📝 Abstract

Loop invariant generation remains a critical bottleneck in automated program verification. Recent work has begun to explore the use of Large Language Models (LLMs) in this area, yet these approaches tend to lack a reliable and structured methodology, with little reference to existing program verification theory. This paper presents NeuroInv, a neurosymbolic approach to loop invariant generation. NeuroInv comprises two key modules: (1) a neural reasoning module that leverages LLMs and Hoare logic to derive and refine candidate invariants via backward-chaining weakest precondition reasoning, and (2) a verification-guided symbolic module that iteratively repairs invariants using counterexamples from OpenJML. We evaluate NeuroInv on a comprehensive benchmark of 150 Java programs, encompassing single and multiple (sequential) loops, multiple arrays, random branching, and noisy code segments. NeuroInv achieves a $99.5\%$ success rate, substantially outperforming the other evaluated approaches. Additionally, we introduce a hard benchmark of $10$ larger multi-loop programs (with an average of $7$ loops each); NeuroInv's performance in this setting demonstrates that it can scale to more complex verification scenarios.
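
To make the terms concrete, here is a minimal Java sketch that is not taken from the paper. It shows a toy summation loop annotated with standard JML clauses (requires, ensures, loop_invariant, decreases), plus a hand-written weakest precondition of the loop body with respect to the invariant. The class name SumLoopExample, its method names, and the invariant itself are illustrative assumptions. The sketch does not claim that OpenJML discharges these annotations exactly as written, and it says nothing about how NeuroInv prompts the LLM.

```java
// Hypothetical example: a toy loop, its JML invariant, and the weakest
// precondition (WP) of the loop body computed by backward substitution.
final class SumLoopExample {

    //@ requires 0 <= n && n <= 1000;
    //@ ensures \result == n * (n + 1) / 2;
    static int sumUpTo(int n) {
        int i = 0;
        int sum = 0;
        //@ loop_invariant 0 <= i && i <= n;
        //@ loop_invariant sum == i * (i + 1) / 2;
        //@ decreases n - i;
        while (i < n) {
            i = i + 1;
            sum = sum + i;
        }
        return sum;
    }

    // Runtime mirror of the two loop_invariant clauses above.
    static boolean invariantHolds(int i, int sum, int n) {
        return 0 <= i && i <= n && sum == i * (i + 1) / 2;
    }

    // WP of the body "i = i + 1; sum = sum + i;" with respect to the invariant.
    // Working backward: wp(sum = sum + i, Inv) = Inv[sum := sum + i], and then
    // wp(i = i + 1, ...) replaces i by i + 1, giving Inv(i + 1, sum + (i + 1)).
    static boolean wpOfBody(int i, int sum, int n) {
        return invariantHolds(i + 1, sum + (i + 1), n);
    }

    // Bounded sanity check of inductiveness: Inv && guard ==> wp(body, Inv).
    // This is a brute-force test on small states, not a proof.
    static boolean inductiveOnSmallStates(int bound) {
        int maxSum = bound * (bound + 1) / 2;
        for (int n = 0; n <= bound; n++) {
            for (int i = 0; i <= n; i++) {
                for (int sum = 0; sum <= maxSum; sum++) {
                    boolean pre = invariantHolds(i, sum, n) && i < n;
                    if (pre && !wpOfBody(i, sum, n)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }
}
```

At loop exit the guard is false, so i >= n; together with i <= n from the invariant this gives i == n, and the second invariant clause then yields the postcondition. That exit argument is the other half of what a verifier would need to check.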

Problem

Research questions and friction points this paper is trying to address.

Generates loop invariants for program verification

Combines neural and symbolic reasoning for reliability

Scales to complex multi-loop and noisy code scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neurosymbolic approach combines LLMs with Hoare logic

Backward-chaining weakest precondition reasoning refines invariants

Verification-guided symbolic repair uses counterexamples from OpenJML
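
The propose, verify, and repair cycle described above can be sketched as follows. This is a toy illustration under loud assumptions, not NeuroInv's implementation: the check method is a bounded brute-force stand-in for a real verifier such as OpenJML (it does not call OpenJML), the propose method replaces the LLM with a fixed candidate pool, and all names (RepairLoopSketch, State, Candidate, Counterexample) are hypothetical. It assumes Java 17 or later for records and switch expressions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class RepairLoopSketch {
    // State of the toy loop: i = 0; sum = 0; while (i < n) { i++; sum += i; }
    record State(int i, int sum, int n) {
        State step() { return new State(i + 1, sum + i + 1, n); }
    }
    record Candidate(String text, Predicate<State> holds) {}
    // kind is "init", "step", or "post"; state is the offending (pre-)state.
    record Counterexample(String kind, State state) {}

    // Hypothetical verifier stand-in: checks initiation, consecution, and
    // sufficiency for the postcondition sum == n*(n+1)/2 on small states only.
    static Optional<Counterexample> check(Candidate c) {
        for (int n = 0; n <= 4; n++) {
            State init = new State(0, 0, n);
            if (!c.holds().test(init)) return Optional.of(new Counterexample("init", init));
            for (int i = 0; i <= 5; i++) {
                for (int sum = 0; sum <= 15; sum++) {
                    State s = new State(i, sum, n);
                    if (!c.holds().test(s)) continue;
                    if (i < n && !c.holds().test(s.step()))
                        return Optional.of(new Counterexample("step", s));
                    if (i >= n && sum != n * (n + 1) / 2)
                        return Optional.of(new Counterexample("post", s));
                }
            }
        }
        return Optional.empty();
    }

    // A candidate is consistent with a counterexample if it no longer fails on it.
    static boolean consistent(Candidate c, Counterexample cex) {
        State s = cex.state();
        return switch (cex.kind()) {
            case "init" -> c.holds().test(s);     // must include the initial state
            case "post" -> !c.holds().test(s);    // must exclude the bad exit state
            default -> !c.holds().test(s) || c.holds().test(s.step()); // "step"
        };
    }

    // Stand-in for the LLM: first pool entry consistent with all feedback so far.
    static Optional<Candidate> propose(List<Candidate> pool, List<Counterexample> seen) {
        return pool.stream()
                .filter(c -> seen.stream().allMatch(x -> consistent(c, x)))
                .findFirst();
    }

    // Iterate propose/check until a candidate passes or the budget runs out.
    static Optional<Candidate> repair(List<Candidate> pool, int maxRounds) {
        List<Counterexample> seen = new ArrayList<>();
        for (int round = 0; round < maxRounds; round++) {
            Optional<Candidate> next = propose(pool, seen);
            if (next.isEmpty()) return Optional.empty();
            Optional<Counterexample> cex = check(next.get());
            if (cex.isEmpty()) return next;
            seen.add(cex.get());
        }
        return Optional.empty();
    }

    static List<Candidate> defaultPool() {
        return List.of(
            new Candidate("sum >= 0", s -> s.sum() >= 0),
            new Candidate("0 <= i && i <= n", s -> 0 <= s.i() && s.i() <= s.n()),
            new Candidate("sum == i * (i + 1) / 2", s -> s.sum() == s.i() * (s.i() + 1) / 2),
            new Candidate("0 <= i && i <= n && sum == i * (i + 1) / 2",
                s -> 0 <= s.i() && s.i() <= s.n() && s.sum() == s.i() * (s.i() + 1) / 2));
    }
}
```

Calling RepairLoopSketch.repair(RepairLoopSketch.defaultPool(), 5) walks through the pool, using each counterexample to rule out later candidates that would fail in the same way, and stops at the first candidate the bounded check accepts. In a real pipeline the counterexample would instead be fed back into the next LLM prompt, and acceptance would come from the verifier rather than from enumeration.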
