Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience

📅 2025-12-19
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address the high computational cost and limited generalization of large language models (LLMs) in undergraduate- and graduate-level theorem proving within formal systems like Lean, this paper proposes an agent-based reinforcement learning paradigm for experience-driven learning. The method leverages high-quality formal feedback as supervisory signals and introduces a natural-language/formal-language co-guided test-time scaling (TTS) workflow to enable efficient experience accumulation and policy optimization under constrained compute. Key innovations include the first formal-feedback-driven continual experience learning mechanism and a cross-modal alignment-enhanced TTS inference framework. Experiments demonstrate state-of-the-art performance: 88%, 80%, and 33% accuracy on PutnamBench, Fate-H, and Fate-X, respectively; notably, the approach solves 11 of the 12 problems from Putnam 2025 within nine hours.

๐Ÿ“ Abstract
Large language models have recently made significant progress in generating rigorous mathematical proofs. In contrast, utilizing LLMs for theorem proving in formal languages (such as Lean) remains challenging and computationally expensive, particularly when addressing problems at the undergraduate level and beyond. In this work, we present Seed-Prover 1.5, a formal theorem-proving model trained via large-scale agentic reinforcement learning, alongside an efficient test-time scaling (TTS) workflow. Through extensive interactions with Lean and other tools, the model continuously accumulates experience during the RL process, substantially enhancing the capability and efficiency of formal theorem proving. Furthermore, leveraging recent advancements in natural language proving, our TTS workflow efficiently bridges the gap between natural and formal languages. Compared to state-of-the-art methods, Seed-Prover 1.5 achieves superior performance with a smaller compute budget. It solves 88% of PutnamBench (undergraduate-level), 80% of Fate-H (graduate-level), and 33% of Fate-X (PhD-level) problems. Notably, using our system, we solved 11 out of 12 problems from Putnam 2025 within 9 hours. Our findings suggest that scaling learning from experience, driven by high-quality formal feedback, holds immense potential for the future of formal mathematical reasoning.
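For readers unfamiliar with formal theorem proving, the task the abstract describes is closing goals like the toy example below, where the Lean kernel either accepts the proof term or reports the unsolved goal. This is an illustrative Lean 4 sketch (using the standard `Nat.add_comm` lemma), not a problem from the paper's benchmarks:

```lean
-- A trivial example of the kind of statement a formal prover must close.
-- The Lean compiler's accept/reject signal is the "high-quality formal
-- feedback" the abstract refers to.
theorem add_comm_example (a b : ℕ) : a + b = b + a := by
  exact Nat.add_comm a b
```

Putnam- and Fate-level problems require far longer tactic sequences, but the feedback mechanism is the same: every candidate proof is machine-checked, so a verified proof is guaranteed correct.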
Problem

Research questions and friction points this paper is trying to address.

High computational cost and limited generalization of LLMs for formal theorem proving in Lean
The gap between natural-language and formal-language mathematical reasoning
Weak performance of existing provers on problems at the undergraduate level and beyond
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale agentic reinforcement learning for formal theorem proving
Efficient test-time scaling workflow bridging natural and formal languages
Continuous experience accumulation through interaction with Lean tools
Jiangjie Chen
ByteDance Seed
NLP, Machine Reasoning, Large Language Models, Autonomous Agent
Wenxiang Chen
Fudan University
LLM reasoning, LLM-based agent
Jiacheng Du
Zhejiang University
Trustworthy AI
Jinyi Hu
ByteDance Seed AI4Math
Zhicheng Jiang
ByteDance Seed AI4Math
Allan Jie
ByteDance Seed AI4Math
Xiaoran Jin
ByteDance Seed AI4Math
Xing Jin
PhD candidate in Computer Science, Syracuse University
Mobile Security, Web Security, Data Mining
Chenggang Li
ByteDance Seed AI4Math
Wenlei Shi
Microsoft Research Asia
reinforcement learning, machine learning
Zhihong Wang
ByteDance Seed AI4Math
Mingxuan Wang
ByteDance Seed AI4Math
Chenrui Wei
ByteDance Seed AI4Math
Shufa Wei
ByteDance Seed AI4Math
Huajian Xin
University of Edinburgh & ByteDance Seed
LLMs for theorem proving
Fan Yang
ByteDance Seed AI4Math
Weihao Gao
Moonshot AI
Machine Learning, Deep Learning, Information Theory
Zheng Yuan
ByteDance Seed AI4Math
Tianyang Zhan
ByteDance Seed AI4Math
Zeyu Zheng
DeepMind
artificial intelligence, machine learning, reinforcement learning, deep learning
Tianxi Zhou
ByteDance Seed AI4Math
Thomas Hanwen Zhu
ByteDance Seed AI4Math