Lean-STaR: Learning to Interleave Thinking and Proving

📅 2024-07-14
🏛️ arXiv.org
📈 Citations: 15
Influential: 3
🤖 AI Summary
Conventional formal theorem-proving methods overlook the informal reasoning inherent in human proof construction, which limits language models' ability to learn deep deductive reasoning. Method: This paper explicitly models interpretable informal reasoning as an intermediate step in proof generation, introducing the first "thinking–proving" co-enhancement framework. Built on the Self-Taught Reasoner (STaR), it combines retrospective synthetic thought generation from ground-truth tactics, correctness verification via the Lean prover, and multiple rounds of expert-iteration fine-tuning. Contribution/Results: The approach achieves 46.3% Pass@64 on miniF2F-test, surpassing the prior state-of-the-art baseline (43.4%) and establishing a new SOTA in the Lean environment. It is the first systematic effort to bridge formal proof automation and human-like informal reasoning.
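The "thought before tactic" format the summary describes can be illustrated with a minimal Lean 4 sketch; the theorem and the thought text below are hypothetical examples, not drawn from the paper's dataset:

```lean
-- Illustrative interleaved proof step: the model emits an informal
-- thought (here, a comment) before predicting each tactic.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  -- Thought: addition on natural numbers is commutative,
  -- so `Nat.add_comm a b` should close the goal directly.
  exact Nat.add_comm a b
```

At training time the thoughts are generated retrospectively from known ground-truth tactics; at inference time the model produces both the thought and the tactic.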

📝 Abstract
Traditional language model-based theorem proving assumes that by training on a sufficient amount of formal proof data, a model will learn to prove theorems. Our key observation is that a wealth of informal information that is not present in formal proofs can be useful for learning to prove theorems. For instance, humans think through steps of a proof, but this thought process is not visible in the resulting code. We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof, thereby boosting the model's theorem-proving capabilities. Lean-STaR uses retrospective ground-truth tactics to generate synthetic thoughts for training the language model. At inference time, the trained model directly generates the thoughts prior to the prediction of the tactics in each proof step. Building on the self-taught reasoner framework, we then apply expert iteration to further fine-tune the model on the correct proofs it samples and verifies using the Lean solver. Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment, significantly outperforming base models (43.4% → 46.3%, Pass@64). We also analyze the impact of the augmented thoughts on various aspects of the theorem proving process, providing insights into their effectiveness.
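The expert-iteration loop described in the abstract (sample proofs, keep only those the verifier accepts, fine-tune on them) can be sketched as a toy, self-contained loop. Here `lean_verifies` and `finetune` are hypothetical stand-ins for the Lean 4 checker and the supervised fine-tuning step, and sampling cycles deterministically through candidate proofs instead of querying a language model:

```python
def sample_proof(model, i):
    # Stand-in for sampling a (thought, tactic) sequence from the model;
    # here we simply cycle through the model's candidate proofs.
    return model["candidates"][i % len(model["candidates"])]

def lean_verifies(theorem, proof):
    # Stand-in for checking a candidate proof with the Lean prover.
    return proof == theorem["ground_truth"]

def finetune(model, verified):
    # Stand-in for fine-tuning: fold verified proofs back into the model.
    model["candidates"].extend(proof for _, proof in verified)
    return model

def expert_iteration(model, theorems, rounds=2, samples=8):
    for _ in range(rounds):
        verified = []
        for thm in theorems:
            for i in range(samples):
                proof = sample_proof(model, i)
                if lean_verifies(thm, proof):
                    verified.append((thm, proof))
                    break  # one verified proof per theorem suffices here
        model = finetune(model, verified)
    return model

# Tiny worked example with one theorem and three candidate proofs.
theorems = [{"ground_truth": "exact Nat.add_comm a b"}]
model = {"candidates": ["simp", "ring", "exact Nat.add_comm a b"]}
trained = expert_iteration(model, theorems)
```

Each round retains only proofs the verifier accepts, so the training pool grows with verified samples; in the paper, this loop runs over the model's own sampled thought-and-tactic proofs checked by Lean.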
Problem

Research questions and friction points this paper is trying to address.

Enhance theorem proving by incorporating informal thought processes.
Train models to generate synthetic thoughts for proof steps.
Improve theorem-proving accuracy using expert iteration and Lean solver.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates informal thoughts before proof steps
Uses retrospective ground-truth tactics for training
Applies expert iteration for fine-tuning proofs
🔎 Similar Papers
No similar papers found.
Haohan Lin
Institute for Interdisciplinary Information Sciences, Tsinghua University
Zhiqing Sun
OpenAI
Yiming Yang
Language Technologies Institute, Carnegie Mellon University
Sean Welleck
Language Technologies Institute, Carnegie Mellon University