Adaptive Proof Refinement with LLM-Guided Strategy Selection

📅 2025-10-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
In formal verification, LLM-generated initial proofs frequently contain errors, and existing correction methods rely on static, fixed strategies that cannot adapt to diverse error types, hindering automation and scalability. To address this, we propose Adapt, the first framework to leverage LLM-driven dynamic strategy selection: it adaptively schedules multiple correction strategies based on real-time proof states, contextual information, and fine-grained error diagnostics, enabling closed-loop refinement. Adapt generalizes across models without manual strategy configuration. Evaluated on two mainstream theorem-proving benchmarks, MiniF2F and ProofNet, Adapt achieves absolute improvements of 16.63% and 18.58% in theorem-proving success rate over the strongest baselines, respectively, and ablation studies confirm the necessity and efficacy of each component. These results demonstrate substantial gains in the robustness and practical utility of LLMs for formal verification.

📝 Abstract
Formal verification via theorem proving enables the expressive specification and rigorous proof of software correctness, but it is difficult to scale due to the significant manual effort and expertise required. While Large Language Models (LLMs) show potential in proof generation, they frequently produce incorrect proofs on the first attempt and require additional strategies for iterative refinement. However, existing approaches employ fixed refinement strategies and cannot dynamically choose an effective strategy based on the particular issues in a generated proof, which limits their performance. To overcome this limitation, we introduce Adapt, a novel proof refinement framework that leverages an LLM-guided decision-maker to dynamically select a suitable refinement strategy according to the state of the proof assistant and available context of an incorrect proof. We evaluate Adapt on two benchmarks against four existing methods and find that it significantly outperforms the best baseline on both by proving 16.63% and 18.58% more theorems, respectively. Furthermore, we demonstrate Adapt's generalizability by evaluating it across five different LLMs. We also conduct ablation studies to measure the contribution of each component and compare the trade-offs of alternative decision-maker designs.
Problem

Research questions and friction points this paper is trying to address.

Dynamically selecting refinement strategies for LLM-generated incorrect proofs
Overcoming limitations of fixed refinement approaches in theorem proving
Adaptive framework leveraging LLM guidance for proof verification improvement
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided decision-maker dynamically selects refinement strategies
Adaptive proof refinement framework based on proof state
Context-aware strategy selection for incorrect proof correction
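The refinement loop implied by these contributions can be sketched in Python. This is a hypothetical illustration, not the paper's implementation: the strategy names, the rule-based stand-in for the LLM decision-maker, and the `check` callback are all assumptions made for the sake of the example.

```python
# Hypothetical sketch of an Adapt-style refinement loop. The strategies,
# error-matching rules, and function names are illustrative only; the
# actual framework prompts an LLM with the proof state and context.
from typing import Callable, Dict, Optional

# Candidate refinement strategies: each takes the failing proof and the
# proof assistant's error feedback and returns a revised proof.
def rewrite_failed_step(proof: str, error: str) -> str:
    return proof + "  (* rewrite failing tactic *)"

def regenerate_whole_proof(proof: str, error: str) -> str:
    return "(* fresh proof attempt *)"

def add_missing_lemma(proof: str, error: str) -> str:
    return "(* auxiliary lemma *)\n" + proof

STRATEGIES: Dict[str, Callable[[str, str], str]] = {
    "rewrite_step": rewrite_failed_step,
    "regenerate": regenerate_whole_proof,
    "add_lemma": add_missing_lemma,
}

def decision_maker(proof: str, error: str) -> str:
    """Stand-in for the LLM-guided decision-maker: maps the proof
    assistant's error message to a strategy name. A real system would
    query an LLM with the proof state and available context instead."""
    if "unknown identifier" in error:
        return "add_lemma"
    if "tactic failed" in error:
        return "rewrite_step"
    return "regenerate"

def refine(proof: str,
           check: Callable[[str], Optional[str]],
           budget: int = 5) -> Optional[str]:
    """Closed-loop refinement: check the proof, select a strategy for
    the reported error, apply it, and repeat until the proof is accepted
    or the attempt budget is exhausted."""
    for _ in range(budget):
        error = check(proof)
        if error is None:
            return proof  # accepted by the proof assistant
        strategy = STRATEGIES[decision_maker(proof, error)]
        proof = strategy(proof, error)
    return None
```

The key design point the paper argues for is that `decision_maker` is dynamic: rather than always applying one fixed strategy, each iteration re-selects based on the current failure, which is what enables adaptation to diverse error types.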
Minghai Lu, Purdue University, West Lafayette, IN, USA
Zhe Zhou, Purdue University, West Lafayette, IN, USA
Danning Xie, Purdue University (software engineering)
Songlin Jia, Purdue University (Programming Languages)
Benjamin Delaware, Assistant Professor, Purdue University (Programming Languages, Formal Methods)
Tianyi Zhang, Purdue University, West Lafayette, IN, USA