A Combinatorial Identities Benchmark for Theorem Proving via Automated Theorem Generation

📅 2025-02-25
🤖 AI Summary
High-quality formalized data for combinatorics remains scarce, which limits the performance of automated theorem provers (ATPs) on combinatorial identities. Method: We introduce LeanComb, to our knowledge the first Lean-formalized benchmark dedicated to combinatorial identities, and propose ATG4CI, a framework that combines candidate tactics suggested by a self-improving large language model (LLM) with a reinforcement learning tree search for tactic prediction, so that new combinatorial identities can be generated and formally verified automatically. Contribution/Results: Using ATG4CI, we build LeanComb-Enhanced, a dataset of 260K combinatorial identity theorems, each with a complete formal proof in Lean. Experiments show that models trained on this dataset generate more effective tactics and achieve higher ATP success rates on combinatorial identities.

📝 Abstract
Large language models (LLMs) have significantly advanced formal theorem proving, yet the scarcity of high-quality training data constrains their capabilities in complex mathematical domains. Combinatorics, a cornerstone of mathematics, provides essential tools for analyzing discrete structures and solving optimization problems. However, its inherent complexity makes automated theorem proving (ATP) for combinatorial identities particularly challenging. To address this, we manually construct LeanComb, a benchmark of combinatorial identities in Lean, which is, to our knowledge, the first formalized theorem-proving benchmark built for combinatorial identities. We develop an Automated Theorem Generator for Combinatorial Identities, ATG4CI, which combines candidate tactics suggested by a self-improving large language model with a Reinforcement Learning Tree Search approach for tactic prediction. Using ATG4CI, we generate the LeanComb-Enhanced dataset, comprising 260K combinatorial identity theorems, each with a complete formal proof in Lean. Experimental evaluations demonstrate that models trained on this dataset generate more effective tactics, thereby improving success rates in automated theorem proving for combinatorial identities.
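
To make concrete what a formalized combinatorial identity with a complete Lean proof can look like, here is a minimal sketch in Lean 4 with Mathlib. It is an illustration only and is not taken from LeanComb or LeanComb-Enhanced: the theorem names `pascal_rule` and `choose_symmetry` are hypothetical, and the proofs simply reuse the existing Mathlib lemmas `Nat.choose_succ_succ` and `Nat.choose_symm`, assuming a recent Mathlib version.

```lean
import Mathlib

-- Hypothetical example (not from the benchmark).
-- Pascal's rule: C(n+1, k+1) = C(n, k) + C(n, k+1).
theorem pascal_rule (n k : ℕ) :
    Nat.choose (n + 1) (k + 1) = Nat.choose n k + Nat.choose n (k + 1) := by
  exact Nat.choose_succ_succ n k

-- Hypothetical example (not from the benchmark).
-- Symmetry: C(n, n - k) = C(n, k) whenever k ≤ n.
theorem choose_symmetry (n k : ℕ) (h : k ≤ n) :
    Nat.choose n (n - k) = Nat.choose n k := by
  exact Nat.choose_symm h
```

Each proof here is a single tactic step. The identities targeted by a benchmark of this kind would typically need longer tactic sequences, which is where tactic prediction and search come in.
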
Problem

Research questions and friction points this paper is trying to address.

Addresses scarcity of high-quality training data
Focuses on automated theorem proving in combinatorics
Introduces LeanComb benchmark for combinatorial identities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Manual construction of LeanComb benchmark
ATG4CI combines LLM and RL Tree Search
Generates 260K combinatorial identities theorems