🤖 AI Summary
This work proposes a verifiable intelligent tutoring system for mathematical proofs that combines the strengths of large language models (LLMs) and formal proof assistants such as Lean. LLMs are prone to errors in mathematical reasoning, while formal provers have a steep learning curve. To address both limitations, the system integrates three core modules: an autoformalizer and proof checker, a next-step proof generator, and a natural language feedback generator. It pairs the conversational flexibility of LLMs with the rigorous verifiability of theorem provers, aiming for both user-friendliness and reliability. The authors implement a proof-of-concept system, LeanTutor, and introduce PeanoBench, a dataset of 371 Peano Arithmetic proofs, each written in both human-authored natural language and formal language and derived from the Natural Numbers Game, to evaluate their approach. The work is a step toward verifiable AI-driven tutoring in formal mathematics.
📝 Abstract
This paper considers the development of an AI-based, provably correct mathematical proof tutor. While Large Language Models (LLMs) allow seamless communication in natural language, they are error-prone. Theorem provers such as Lean allow for provable correctness, but they are hard for students to learn. We present a proof-of-concept system, LeanTutor, that combines the complementary strengths of LLMs and theorem provers. LeanTutor is composed of three modules: (i) an autoformalizer/proof-checker, (ii) a next-step generator, and (iii) a natural language feedback generator. To evaluate the system, we introduce PeanoBench, a dataset of 371 Peano Arithmetic proofs in human-written natural language and formal language, derived from the Natural Numbers Game.
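To make the pairing of natural language and formal language concrete, the following is a minimal Lean 4 sketch of the kind of Peano Arithmetic statement that the Natural Numbers Game covers. It is an illustration only and is not taken from PeanoBench: the names `MyNat`, `add`, and `zero_add` are assumptions chosen for this example, and the informal proof in the comments is our own wording rather than a dataset entry.

```lean
-- Hypothetical illustration; not an entry from PeanoBench.
-- A self-contained copy of the Peano naturals, so no library lemmas are needed.
inductive MyNat where
  | zero : MyNat
  | succ : MyNat → MyNat

namespace MyNat

-- Addition defined by recursion on the second argument.
def add : MyNat → MyNat → MyNat
  | m, zero => m
  | m, succ n => succ (add m n)

-- Informal proof: induct on n.
--   Base case: 0 + 0 = 0 holds by the definition of addition.
--   Inductive step: assume 0 + k = k. Then 0 + succ k = succ (0 + k)
--   by the definition of addition, which equals succ k by the hypothesis.
theorem zero_add (n : MyNat) : add zero n = n := by
  induction n with
  | zero => rfl
  | succ k ih =>
    -- Unfold one step of `add` (true by definition), then use the hypothesis.
    show succ (add zero k) = succ k
    rw [ih]

end MyNat
```

Each tactic line corresponds to one sentence of the informal proof, which is the kind of step-by-step correspondence a proof tutor would need in order to check a student's natural-language step against a formal one.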