🤖 AI Summary
This work investigates whether large language models (LLMs) can find novel theorems in formal mathematics, rather than only proving statements that are already given.
Method: We propose a Conjecturing-Proving Loop in which an LLM generates mathematical conjectures stated in Lean 4 and then attempts to prove them formally. The key feature is in-context learning: previously generated theorems and their Lean 4 proofs are added to the context for later rounds, so the model can reuse proof strategies and prove more difficult statements without any change to its parameters.
Contribution/Results: The framework rediscovers, with formal verification, theorems that were published in past mathematical papers but had not yet been formalized. At least one of these theorems could not be proved by the LLM without in-context learning, even in natural language, which indicates that in-context learning is effective for neural theorem proving. The results suggest a possible path toward LLM-driven theorem discovery grounded in formal verification.
📝 Abstract
Large language models (LLMs) have demonstrated significant promise in formal theorem proving. However, previous work mainly focuses on solving existing problems. In this paper, we focus on the ability of LLMs to find novel theorems. We propose the Conjecturing-Proving Loop, a pipeline for automatically generating mathematical conjectures and proving them in Lean 4. A key feature of our approach is that further conjectures are generated and proved with a context that includes previously generated theorems and their proofs. This enables more difficult proofs through in-context learning of proof strategies, without changing the parameters of the LLM. We demonstrate that our framework rediscovered, with formal verification, theorems that were published in past mathematical papers and had not yet been formalized. Moreover, at least one of these theorems could not be proved by the LLM without in-context learning, even in natural language, which indicates that in-context learning was effective for neural theorem proving. The source code is available at https://github.com/auto-res/ConjecturingProvingLoop.
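
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the code from the linked repository: the names `Theorem`, `build_context`, and `conjecturing_proving_loop` are hypothetical, and the three callables `generate`, `prove`, and `verify` are stand-ins for an LLM conjecture generator, an LLM prover, and a Lean 4 check. How the actual pipeline formats its context, and when it refreshes it, is an assumption here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Theorem:
    statement: str  # Lean 4 statement text (hypothetical representation)
    proof: str      # Lean 4 proof text that passed verification


def build_context(library: List[Theorem]) -> str:
    """Join verified theorems and their proofs into one prompt context."""
    return "\n\n".join(f"{t.statement} := by\n{t.proof}" for t in library)


def conjecturing_proving_loop(
    generate: Callable[[str], List[str]],         # context -> conjectures
    prove: Callable[[str, str], Optional[str]],   # (statement, context) -> proof or None
    verify: Callable[[str, str], bool],           # (statement, proof) -> accepted?
    rounds: int,
) -> List[Theorem]:
    """Alternate conjecturing and proving, growing the context each round."""
    library: List[Theorem] = []
    for _ in range(rounds):
        # Assumption: the context is rebuilt once per round, so theorems
        # verified in this round become visible only in the next one.
        context = build_context(library)
        for statement in generate(context):
            if any(t.statement == statement for t in library):
                continue  # skip conjectures that are already verified
            proof = prove(statement, context)
            # Only formally verified proofs are added; no model parameters change.
            if proof is not None and verify(statement, proof):
                library.append(Theorem(statement, proof))
    return library
```

In this sketch the only thing carried between rounds is the list of verified theorems, which matches the abstract's point that harder proofs come from in-context learning rather than from fine-tuning.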