Monotonic Reference-Free Refinement for Autoformalization

📅 2026-01-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of fully automated theorem formalization, which requires simultaneous optimization of formal validity, logical fidelity, mathematical consistency, and syntactic quality—dimensions often addressed in isolation by existing approaches that also typically rely on reference standards. To overcome these limitations, we propose a reference-free, monotonically improving iterative optimization framework that leverages complementary feedback from a theorem prover and a panel of multi-role large language models (LLMs). Our approach introduces a novel response mapping mechanism to guide each LLM role toward targeted refinements and incorporates an acceptance strategy with convergence criteria that guarantee monotonic performance improvement. Experimental results demonstrate that our method achieves 93.44% formal validity and 78.22% overall score on miniF2F, and 44.09% formal validity with 29.79% overall score on ProofNet, establishing new state-of-the-art performance without reference-guided supervision.

📝 Abstract
While statement autoformalization has advanced rapidly, full-theorem autoformalization remains largely unexplored. Existing iterative refinement methods in statement autoformalization typically improve isolated aspects of formalization, such as syntactic correctness, but struggle to jointly optimize multiple quality dimensions, which is critical for full-theorem autoformalization. We introduce a reference-free iterative monotonic process for full-theorem autoformalization that leverages complementary feedback from theorem provers and LLM-based judges, without access to ground-truth proofs or existing formalizations at inference time. Our approach optimizes a masked composite objective over Formal Validity, Logical Preservation, Mathematical Consistency, and Formal Quality, guided by a responsiveness map that indicates how different LLMs acting in different roles preferentially improve each dimension. We further propose an acceptance policy that guarantees certified monotonic improvement, and provide conditions ensuring convergence and termination. Empirical experiments demonstrate that the proposed process enables simultaneous improvement across multiple dimensions, achieving 93.44% formal validity and a 78.22% overall score on miniF2F, and 44.09% formal validity and a 29.79% overall score on ProofNet.
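The acceptance policy described in the abstract can be illustrated with a minimal sketch: a candidate refinement is kept only if its masked composite score does not fall below the current best, so the score trajectory is non-decreasing by construction. The dimension names, equal weighting, and the `refine`/`composite_score` helpers below are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of a monotonic acceptance policy over a masked composite objective.
# Dimension names follow the abstract; the equal-weight average and the API
# shape are assumptions for illustration only.
DIMENSIONS = ("formal_validity", "logical_preservation",
              "mathematical_consistency", "formal_quality")

def composite_score(scores, mask=None):
    """Masked composite objective: average only the unmasked dimensions."""
    mask = mask or {}
    active = [scores[d] for d in DIMENSIONS if mask.get(d, True)]
    return sum(active) / len(active)

def refine(initial_scores, candidate_scores, mask=None):
    """Accept a candidate only if it does not lower the composite score,
    so the recorded trajectory is monotonically non-decreasing."""
    best = dict(initial_scores)
    trajectory = [composite_score(best, mask)]
    for cand in candidate_scores:
        if composite_score(cand, mask) >= trajectory[-1]:
            best = dict(cand)          # certified improvement: accept
        trajectory.append(composite_score(best, mask))  # else keep best
    return best, trajectory
```

In a real loop the candidates would come from multi-role LLM refinements scored by a prover and LLM judges; here they are plain score dictionaries, which is enough to show why the trajectory can never regress.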
Problem

Research questions and friction points this paper is trying to address.

autoformalization
full-theorem
iterative refinement
multi-dimensional optimization
reference-free
Innovation

Methods, ideas, or system contributions that make the work stand out.

monotonic refinement
reference-free autoformalization
full-theorem formalization
composite objective optimization
LLM-based judging