Automated Formalization via Conceptual Retrieval-Augmented LLMs

📅 2025-08-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Automated formalization faces two key challenges: hallucinations in formal models (e.g., undefined predicates, symbol misuse) and semantic gaps arising from ambiguous natural-language premises. To address these, we propose CRAMF, a concept-driven retrieval-augmented framework. CRAMF introduces the first methodology for constructing a formal knowledge base grounded in core mathematical concepts, enabling structured organization of over 26,000 definitions in Lean 4/Mathlib4. It incorporates context-aware query augmentation and a dual-channel hybrid retrieval and reranking mechanism to handle mathematical polymorphism and ensure high-precision retrieval. Evaluated on the miniF2F, ProofNet, and AdvancedMath benchmarks, CRAMF improves formalization translation accuracy by up to 62.1%, with an average relative gain of 29.9%. These results demonstrate substantial improvements in both the accuracy and robustness of large language models for interactive theorem proving.

📝 Abstract
Interactive theorem provers (ITPs) require manual formalization, which is labor-intensive and demands expert knowledge. While automated formalization offers a potential solution, it faces two major challenges: model hallucination (e.g., undefined predicates, symbol misuse, and version incompatibility) and the semantic gap caused by ambiguous or missing premises in natural language descriptions. To address these issues, we propose CRAMF, a Concept-driven Retrieval-Augmented Mathematical Formalization framework. CRAMF enhances LLM-based autoformalization by retrieving formal definitions of core mathematical concepts, providing contextual grounding during code generation. However, applying retrieval-augmented generation (RAG) in this setting is non-trivial due to the lack of structured knowledge bases, the polymorphic nature of mathematical concepts, and the high precision required in formal retrieval. We introduce a framework for automatically constructing a concept-definition knowledge base from Mathlib4, the standard mathematical library for the Lean 4 theorem prover, indexing over 26,000 formal definitions and 1,000+ core mathematical concepts. To address conceptual polymorphism, we propose contextual query augmentation with domain- and application-level signals. In addition, we design a dual-channel hybrid retrieval strategy with reranking to ensure accurate and relevant definition retrieval. Experiments on miniF2F, ProofNet, and our newly proposed AdvancedMath benchmark show that CRAMF can be seamlessly integrated into LLM-based autoformalizers, yielding consistent improvements in translation accuracy, achieving up to 62.1% and an average of 29.9% relative improvement.
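The abstract's "contextual query augmentation with domain- and application-level signals" can be pictured as expanding a bare concept mention before retrieval, so that polymorphic concepts resolve to the right formal definition. A minimal sketch, assuming a simple template (the function name `augment_query` and the exact format are illustrative; the paper's actual prompt is not shown on this page):

```python
def augment_query(concept: str, domain: str, application: str) -> str:
    """Expand a bare concept mention with domain- and application-level
    signals so that polymorphic concepts (e.g. 'continuous' for real
    functions vs. order-theoretic maps) retrieve the right definition."""
    return (f"concept: {concept}; "
            f"mathematical domain: {domain}; "
            f"usage context: {application}")

print(augment_query("continuous",
                    "real analysis",
                    "epsilon-delta statement about f : R -> R"))
```

The augmented string, rather than the raw concept name, would then be fed to the retrieval stage.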
Problem

Research questions and friction points this paper is trying to address.

Automated formalization suffers from model hallucination (undefined predicates, symbol misuse, version incompatibility)
Ambiguous or missing premises in natural-language descriptions create a semantic gap
Lack of structured knowledge bases complicates high-precision retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

Concept-driven Retrieval-Augmented Mathematical Formalization framework
Automatically constructing concept-definition knowledge base
Dual-channel hybrid retrieval strategy with reranking
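The dual-channel hybrid retrieval pairs a sparse (lexical) channel with a dense (embedding) channel, fuses their rankings, and reranks the fused candidates. A minimal runnable sketch under stated assumptions: the toy corpus, the hashed bag-of-words embedding, reciprocal rank fusion, and the 0.5/0.5 rerank weights are all illustrative stand-ins, not the authors' implementation (which indexes 26,000+ real Mathlib4 definitions):

```python
import hashlib
import math
import re
from collections import Counter

# Toy stand-in corpus; these entries are placeholders, not real Mathlib4 source.
CORPUS = {
    "Continuous": "def Continuous (f : X -> Y) : Prop := ...",
    "UniformContinuous": "def UniformContinuous (f : A -> B) : Prop := ...",
    "Differentiable": "def Differentiable (f : E -> F) : Prop := ...",
    "Monotone": "def Monotone (f : A -> B) : Prop := ...",
}

def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def lexical_score(query, doc):
    """Channel 1: sparse lexical overlap (a BM25 stand-in)."""
    q, d = Counter(tokens(query)), Counter(tokens(doc))
    return sum(min(q[t], d[t]) for t in q)

def embed(text, dim=64):
    """Toy hashed bag-of-words vector (a dense-embedding stand-in)."""
    v = [0.0] * dim
    for t in tokens(text):
        v[int(hashlib.md5(t.encode()).hexdigest(), 16) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dense_score(query, doc):
    """Channel 2: cosine similarity of the toy embeddings."""
    return sum(a * b for a, b in zip(embed(query), embed(doc)))

def hybrid_retrieve(query, k=2, rrf_k=60):
    """Fuse both channels via reciprocal rank fusion, then rerank."""
    by_lex = sorted(CORPUS, key=lambda c: -lexical_score(query, CORPUS[c]))
    by_dense = sorted(CORPUS, key=lambda c: -dense_score(query, CORPUS[c]))
    fused = {c: 1 / (rrf_k + by_lex.index(c)) + 1 / (rrf_k + by_dense.index(c))
             for c in CORPUS}
    candidates = sorted(fused, key=fused.get, reverse=True)[: 2 * k]
    # A real reranker would be a cross-encoder; a weighted combination
    # of the raw channel scores stands in for it here.
    def rerank(c):
        return 0.5 * lexical_score(query, CORPUS[c]) + 0.5 * dense_score(query, CORPUS[c])
    return sorted(candidates, key=rerank, reverse=True)[:k]

print(hybrid_retrieve("the function f is continuous on the real line"))
```

Running the two channels independently and fusing by rank (rather than raw score) avoids having to calibrate lexical counts against cosine similarities; the final rerank then restores score-level precision over the small fused candidate set.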
Wangyue Lu (Northeastern University)
Lun Du (Ant Research Institute, Ant Group)
Sirui Li (Northeastern University)
Ke Weng (Northeastern University)
Haozhe Sun (Northeastern University)
Hengyu Liu (Department of Computer Science, Aalborg University)
Minghe Yu (Northeastern University)
Tiancheng Zhang (Northeastern University, China; research interests: deep learning, machine learning, intelligent education)
Ge Yu (Northeastern University)