🤖 AI Summary
Large language models (LLMs) are often unreliable on formal reasoning tasks such as geometric theorem proving, frequently producing incorrect or unverifiable conclusions.
Method: We propose a neuro-symbolic collaborative framework that integrates analogical retrieval to guide initial proof generation by an LLM, followed by real-time symbolic verification using Coq/Lean-style formal checkers; errors are iteratively corrected via verification feedback, establishing a closed-loop “generate–verify–feedback” pipeline.
Contribution/Results: This work introduces a unified mechanism combining retrieval-augmented generation with closed-loop symbolic validation, ensuring that generated proofs are verifiable and traceable. Leveraging prompt engineering, RAG, formal theorem verification, and iterative refinement, we realize an end-to-end verifiable proof generation system. Applied to OpenAI's o1 model, our approach improves proof accuracy by 58%-70%. Results demonstrate that analogical guidance and symbolic feedback jointly enhance logical reliability, advancing LLMs from plausible-but-unverified outputs toward provably correct reasoning.
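The closed-loop pipeline described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `retrieve_analogous`, `llm_generate`, and `formal_verify` are hypothetical stand-ins for the retrieval module, the LLM, and the Coq/Lean-style checker, respectively.

```python
def retrieve_analogous(problem, corpus):
    """Toy retrieval: return proofs of corpus problems sharing a word with the query.
    (Stand-in for the paper's analogical retrieval component.)"""
    words = set(problem.split())
    return [proof for prob, proof in corpus if words & set(prob.split())]

def formal_verify(proof):
    """Stand-in for a formal checker: accepts proofs ending in 'QED'.
    Returns (ok, feedback); feedback drives the correction step."""
    ok = proof.endswith("QED")
    return ok, (None if ok else "proof must end with QED")

def llm_generate(problem, examples, feedback=None):
    """Stand-in for the LLM: produces a draft, fixing it once feedback arrives."""
    draft = f"proof of {problem}"
    return draft + " QED" if feedback else draft

def prove(problem, corpus, max_rounds=3):
    """Generate-verify-feedback loop: retrieve analogs, draft a proof,
    verify it formally, and retry with the verifier's feedback."""
    examples = retrieve_analogous(problem, corpus)
    feedback = None
    for _ in range(max_rounds):
        proof = llm_generate(problem, examples, feedback)
        ok, feedback = formal_verify(proof)
        if ok:
            return proof
    return None  # give up after max_rounds failed verification attempts

corpus = [("triangle angle sum", "angles sum to 180 QED")]
print(prove("triangle congruence", corpus))
```

Here the first draft fails verification, the checker's feedback is fed back into generation, and the corrected proof passes on the second round; the real system replaces each stub with an actual retriever, LLM call, and formal verifier.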
📝 Abstract
Large language models (LLMs) struggle with formal domains that require rigorous logical deduction and symbolic reasoning, such as mathematical proof generation. We propose a neuro-symbolic approach that combines LLMs' generative strengths with structured components to overcome this challenge. As a proof of concept, we focus on geometry problems. Our approach is two-fold: (1) we retrieve analogous problems and use their proofs to guide the LLM, and (2) a formal verifier evaluates the generated proofs and provides feedback, helping the model fix incorrect proofs. We demonstrate that our method significantly improves proof accuracy for OpenAI's o1 model (a 58%-70% improvement); both the analogous problems and the verifier's feedback contribute to these gains. More broadly, shifting to LLMs that generate provably correct conclusions could dramatically improve their reliability, accuracy, and consistency, unlocking complex tasks and critical real-world applications that require trustworthiness.