Hierarchical Attention Generates Better Proofs

📅 2025-04-27
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) struggle to model the hierarchical structure of mathematical reasoning in formal theorem proving. Method: We propose Hierarchical Attention Regularization (HAR), the first approach to explicitly encode a five-level mathematical reasoning hierarchy—proposition → lemma → step → subgoal → atomic operation—into the Transformer attention mechanism via structural alignment constraints. HAR integrates mathematically grounded attention regularization with formal proof fine-tuning on miniF2F and ProofNet. Contribution/Results: On miniF2F, HAR improves proof success rate by 2.05% and reduces average proof steps by 23.81%; on ProofNet, it achieves a 1.69% gain in success rate and a 16.50% reduction in proof steps. These results demonstrate substantially enhanced modeling of abstract reasoning structures and improved generation efficiency for formal proofs.
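The paper does not spell out the regularizer, but the idea of penalizing attention that violates a level hierarchy can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: it assumes each token is tagged with one of the five levels (atomic operation → subgoal → step → lemma → proposition) and that attention is only "structurally aligned" when it flows between tokens at most one level apart.

```python
import numpy as np

# Five hierarchy levels from the paper, lowest to highest.
# The per-token level tags and the one-hop constraint below are assumptions
# made for illustration; the paper's actual constraint may differ.
LEVELS = ["atomic operation", "subgoal", "step", "lemma", "proposition"]

def hierarchy_mask(levels, max_hop=1):
    """Return a 0/1 matrix: 1 where attention between two tokens is
    structurally allowed (level distance <= max_hop), 0 where penalized."""
    lv = np.asarray(levels)
    return (np.abs(lv[:, None] - lv[None, :]) <= max_hop).astype(float)

def har_penalty(attn, levels, max_hop=1):
    """Average attention mass a token places on structurally disallowed
    positions; added to the training loss as a regularization term."""
    mask = hierarchy_mask(levels, max_hop)
    return float((attn * (1.0 - mask)).sum() / attn.shape[0])

# Toy example: 4 tokens at levels 0, 1, 3, 4 with uniform attention.
levels = [0, 1, 3, 4]
attn = np.full((4, 4), 0.25)
print(har_penalty(attn, levels))  # 0.5: half the mass crosses >1 level
```

In training, a term like this would be weighted and added to the language-modeling loss, so minimizing it pushes attention heads toward the proposition → lemma → step → subgoal → atomic-operation structure.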

📝 Abstract
Large language models (LLMs) have shown promise in formal theorem proving, but their token-level processing often fails to capture the inherent hierarchical nature of mathematical proofs. We introduce Hierarchical Attention, a regularization method that aligns LLMs' attention mechanisms with mathematical reasoning structures. Our approach establishes a five-level hierarchy from foundational elements to high-level concepts, ensuring structured information flow in proof generation. Experiments demonstrate that our method improves proof success rates by 2.05% on miniF2F and 1.69% on ProofNet while reducing proof complexity by 23.81% and 16.50% respectively. The code is available at https://github.com/Car-pe/HAGBP.
Problem

Research questions and friction points this paper is trying to address.

Token-level processing in LLMs fails to capture the hierarchical structure of mathematical proofs
LLM attention mechanisms are not aligned with mathematical reasoning structures
Generated formal proofs are longer and less efficient than necessary
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Attention Regularization (HAR) aligns LLM attention with mathematical proof structures
A five-level hierarchy (proposition → lemma → step → subgoal → atomic operation) ensures structured information flow
Improves proof success rates on miniF2F and ProofNet while reducing proof length
Jianlong Chen
The Chinese University of Hong Kong, Shenzhen
Chao Li
Shanghai Qi Zhi Institute
Yang Yuan
Tsinghua University
Machine learning · Optimization
Andrew C. Yao
Shanghai Qi Zhi Institute, IIIS, Tsinghua University