🤖 AI Summary
Problem: Existing autoformalization methods verify only complete proofs, so fine-grained, sentence-level validation of natural language mathematical proofs is not possible.
Method: We propose StepProof, a stepwise autoformalization framework that decomposes natural language proofs at the sentence level, generates an independent subproof for each step, and verifies the subproofs step by step. Our approach integrates large language models’ semantic understanding, structured reasoning prompts, a dynamic subproof segmentation mechanism, and interactive theorem provers (e.g., Lean), augmented by a lightweight human rewriting strategy optimized for stepwise verification.
Contribution/Results: Evaluated on multiple mathematical benchmarks, our method achieves a new state-of-the-art formalization success rate and improves proof efficiency by over 40%. Minor manual rewriting of the natural language proofs, tailoring them for step-level verification, improves performance further. This is the first work to realize sentence-level formalization and verification of natural language proofs.
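To make "sentence-level formalization" concrete, here is a small Lean 4 sketch in which each sentence of an informal proof maps to one formal step. The theorem, its name, and the informal sentences are illustrative assumptions chosen for this example; they are not taken from the paper or its benchmarks, and the paper's actual target prover and output format may differ.

```lean
-- Illustrative only: one formal step per informal sentence.
-- Informal proof: "Since a is even, a = 2k for some k. Since b is even,
-- b = 2m for some m. Hence a + b = 2(k + m), so a + b is even."
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ m, b = 2 * m) :
    ∃ n, a + b = 2 * n := by
  -- Sentence 1: "Since a is even, a = 2k for some k."
  have ⟨k, hk⟩ := ha
  -- Sentence 2: "Since b is even, b = 2m for some m."
  have ⟨m, hm⟩ := hb
  -- Sentence 3: "Hence a + b = 2(k + m)."
  have hsum : a + b = 2 * (k + m) := by omega
  -- Conclusion: "so a + b is even."
  exact ⟨k + m, hsum⟩
```

Because every sentence has its own step, a failure can be reported against the specific sentence that produced it rather than against the proof as a whole.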
📝 Abstract
Interactive theorem provers (ITPs) are powerful tools for the formal verification of mathematical proofs down to the axiom level. However, their lack of a natural language interface remains a significant limitation. Recent advancements in large language models (LLMs) have enhanced the understanding of natural language inputs, paving the way for autoformalization, the process of translating natural language proofs into formal proofs that can be verified. Despite these advancements, existing autoformalization approaches are limited to verifying complete proofs and lack the capability for finer, sentence-level verification. To address this gap, we propose StepProof, a novel autoformalization method designed for granular, step-by-step verification. StepProof breaks down complete proofs into multiple verifiable subproofs, enabling sentence-level verification. Experimental results demonstrate that StepProof significantly improves proof success rates and efficiency compared to traditional methods. Additionally, we found that minor manual adjustments to the natural language proofs, tailoring them for step-level verification, further enhanced StepProof's performance in autoformalization.
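The step-by-step idea described above can be sketched as a simple loop: split the proof into sentences, formalize each one, check it against the steps already accepted, and record a per-sentence verdict. The following Python is a minimal sketch under stated assumptions, not StepProof's implementation. The names `split_into_steps`, `verify_stepwise`, `toy_formalize`, and `toy_check` are hypothetical; the toy formalizer and checker only handle simple integer arithmetic claims and stand in for an LLM and an interactive theorem prover respectively.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    index: int      # position of the sentence in the proof
    sentence: str   # original natural language sentence
    formal: str     # formalized statement ("" if none was produced)
    verified: bool  # verdict for this single step


def split_into_steps(proof: str) -> List[str]:
    """Naive splitter (assumption): one step per sentence ending in '.'."""
    parts = re.split(r"(?<=\.)\s+", proof.strip())
    return [p for p in parts if p]


def verify_stepwise(
    proof: str,
    formalize: Callable[[str, List[str]], str],
    check: Callable[[str, List[str]], bool],
) -> List[StepResult]:
    """Formalize and check each sentence in order.

    Verified steps are appended to the context for later steps; a failed
    step is reported but does not stop the remaining sentences from
    being checked.
    """
    context: List[str] = []
    results: List[StepResult] = []
    for i, sentence in enumerate(split_into_steps(proof)):
        formal = formalize(sentence, context)
        ok = check(formal, context)
        results.append(StepResult(i, sentence, formal, ok))
        if ok:
            context.append(formal)
    return results


# --- Toy stand-ins (hypothetical): integer arithmetic claims only ---
_EQ = re.compile(r"(\d+)\s*([+\-*])\s*(\d+)\s*=\s*(\d+)")


def toy_formalize(sentence: str, context: List[str]) -> str:
    """Stand-in for an LLM: extract the first 'a op b = c' claim."""
    m = _EQ.search(sentence)
    return m.group(0) if m else ""


def toy_check(formal: str, context: List[str]) -> bool:
    """Stand-in for a theorem prover: evaluate the arithmetic claim."""
    m = _EQ.fullmatch(formal)
    if not m:
        return False
    a, op, b, c = int(m.group(1)), m.group(2), int(m.group(3)), int(m.group(4))
    value = {"+": a + b, "-": a - b, "*": a * b}[op]
    return value == c


if __name__ == "__main__":
    demo = "We have 2 + 3 = 5. Then 5 * 2 = 10. So 10 - 1 = 8."
    for r in verify_stepwise(demo, toy_formalize, toy_check):
        print(r.index, "OK  " if r.verified else "FAIL", r.sentence)
```

In this sketch a whole-proof verifier would only report that the demo proof fails, whereas the per-sentence loop points at the third sentence. Continuing past a failed step is a design choice made here for illustration; stopping at the first failure would be equally simple.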