Learning to Generate Formally Verifiable Step-by-Step Logic Reasoning via Structured Formal Intermediaries

📅 2026-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the unreliability of intermediate reasoning steps in large language models (LLMs) during multi-step inference, a problem often exacerbated by training objectives that optimize only for final-answer correctness. To mitigate this, the authors propose PRoSFI, a novel approach that introduces structured formal intermediate representations, enabling each reasoning step to be verified by a formal prover without requiring the generation of complete formal proofs. By combining structured intermediate representations, formal verification, and a process-reward-based reinforcement learning framework, PRoSFI guides a 7B-parameter model to produce fully verifiable reasoning chains. The method achieves high answer accuracy while substantially enhancing the reliability and trustworthiness of the reasoning process itself.
📝 Abstract
Large language models (LLMs) have recently demonstrated impressive performance on complex, multi-step reasoning tasks, especially when post-trained with outcome-rewarded reinforcement learning (Guo et al., 2025). However, outcome rewards often overlook flawed intermediate steps, leading to unreliable reasoning even when final answers are correct. To address this, we propose PRoSFI (Process Reward over Structured Formal Intermediates), a novel reward method that enhances reasoning reliability without compromising accuracy. Instead of generating formal proofs directly, which is rarely achievable for a modest-sized (7B) model, the model outputs structured intermediate steps aligned with its natural language reasoning. Each step is then verified by a formal prover, and only fully validated reasoning chains receive high rewards. The integration of formal verification guides the model toward generating step-by-step machine-checkable proofs, thereby yielding more credible final answers. PRoSFI offers a simple and effective approach to training trustworthy reasoning models.
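The abstract's reward scheme (high reward only for chains whose every step passes the formal prover and whose answer is correct) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `prove_step` is a hypothetical placeholder for a call to a real formal prover, and the specific reward values (1.0 / 0.1 / 0.0) are assumptions for the sketch.

```python
from typing import Callable, List

def chain_reward(steps: List[str],
                 prove_step: Callable[[str], bool],
                 answer_correct: bool) -> float:
    """All-or-nothing process reward: a high reward is granted only when
    every intermediate step is formally verified AND the final answer is
    correct; a correct answer with unverified reasoning earns little."""
    all_verified = all(prove_step(s) for s in steps)
    if answer_correct and all_verified:
        return 1.0   # fully validated reasoning chain
    if answer_correct:
        return 0.1   # right answer, but the reasoning did not verify
    return 0.0       # wrong answer

# Toy usage: a trivial stand-in "prover" that accepts steps ending in "QED".
demo_prover = lambda s: s.endswith("QED")
print(chain_reward(["a implies b QED", "b implies c QED"], demo_prover, True))  # 1.0
print(chain_reward(["a implies b", "b implies c QED"], demo_prover, True))      # 0.1
```

The all-or-nothing structure is the point: partial credit for partially verified chains would let the policy keep unverifiable steps, whereas this reward only pays out for chains the prover checks end to end.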
Problem

Research questions and friction points this paper is trying to address.

formal verification
reasoning reliability
intermediate steps
large language models
machine-checkable proofs
Innovation

Methods, ideas, or system contributions that make the work stand out.

structured formal intermediates
process reward
formal verification
step-by-step reasoning
trustworthy reasoning