MathlibLemma: Folklore Lemma Generation and Benchmark for Formal Mathematics

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a gap in Lean's Mathlib library: many folklore lemmas, results that mathematicians use routinely but rarely write down, are missing, which limits Lean's practical utility in everyday mathematical research. To bridge this gap, the authors introduce what they describe as the first LLM-based multi-agent system for automatically discovering and formalizing folklore lemmas, shifting LLMs from passive consumers to active contributors of formalized knowledge. Using this pipeline, they build the MathlibLemma benchmark, a suite of 4,028 type-checked Lean statements, and report that a subset of the generated lemmas has been merged into Mathlib. They present the approach as a constructive methodology for the self-evolution of formal mathematical libraries.

📝 Abstract
While the ecosystem of Lean and Mathlib has enjoyed celebrated success in formal mathematical reasoning with the help of large language models (LLMs), the absence of many folklore lemmas in Mathlib remains a persistent barrier that limits Lean's usability as an everyday tool for mathematicians, in the way that LaTeX or Maple are. To address this, we introduce MathlibLemma, the first LLM-based multi-agent system to automate the discovery and formalization of mathematical folklore lemmas. This framework constitutes our primary contribution, proactively mining the missing connective tissue of mathematics. Its efficacy is demonstrated by the production of a verified library of folklore lemmas, a subset of which has already been formally merged into the latest build of Mathlib, thereby validating the system's real-world utility and alignment with expert standards. Leveraging this pipeline, we further construct the MathlibLemma benchmark, a suite of 4,028 type-checked Lean statements spanning a broad range of mathematical domains. By transforming the role of LLMs from passive consumers to active contributors, this work establishes a constructive methodology for the self-evolution of formal mathematical libraries.
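
To make "type-checked Lean statement" concrete, here is a minimal sketch of the kind of small, widely used fact a folklore lemma might capture. It is a hypothetical illustration chosen for this summary, not an item from the MathlibLemma benchmark, and Mathlib already contains a close variant of it (`two_mul_le_add_sq`). It assumes a project with Mathlib available.

```lean
import Mathlib

-- Hypothetical illustration only; not taken from the MathlibLemma benchmark.
-- A folklore-style inequality: twice the product of two reals is bounded
-- by the sum of their squares.
example (a b : ℝ) : 2 * a * b ≤ a ^ 2 + b ^ 2 := by
  -- The hint (a - b) ^ 2 ≥ 0 expands to a ^ 2 - 2 * a * b + b ^ 2 ≥ 0,
  -- from which the goal follows by linear arithmetic.
  nlinarith [sq_nonneg (a - b)]
```

A statement "type-checks" once Lean accepts the line before `:=` as a well-formed proposition; the proof after `by` is a separate obligation.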
Problem

Research questions and friction points this paper is trying to address.

folklore lemmas
formal mathematics
Mathlib
Lean
mathematical formalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

folklore lemmas
formal mathematics
large language models
multi-agent system
Mathlib
Authors
Xinyu Liu
Department of Computer Science, University of Virginia
Zixuan Xie
Department of Computer Science, University of Virginia
Amir Moeini
Department of Computer Science, University of Virginia
Claire Chen
PhD student, Stanford University
contact-rich manipulation, robot learning, multi-modal sensing
S. Liu
Department of Computer Science, University of Virginia
Yu Meng
University of Virginia
Machine Learning, Language Models, Natural Language Processing
Aidong Zhang
Department of Computer Science, University of Virginia
Shangtong Zhang
University of Virginia
reinforcement learning, stochastic approximation