AI Summary
To address the lack of efficient, general-purpose automation tools for theorem proving in Lean, this paper introduces LeanHammer, the first end-to-end, domain-general hammer for Lean, a proof assistant based on dependent type theory. Methodologically, it combines neural premise selection, symbolic proof search, and type-safe proof reconstruction in a single architecture. Its key contributions are: (1) a context-aware neural premise selection mechanism that dynamically adapts to user-provided hypotheses and goals; and (2) the integration of this neural retrieval with symbolic proof search and type-safe proof reconstruction into one pipeline that respects the semantics of dependent types. Experiments show improved premise selection accuracy and 21% more solved goals relative to existing premise selectors. LeanHammer also generalizes well across diverse domains, including mathematical theorem proving and program verification.
Abstract
Neural methods are transforming automated reasoning for proof assistants, yet integrating these advances into practical verification workflows remains challenging. Hammers are tools that interface with external automatic theorem provers to automate tedious reasoning steps. They have dramatically improved productivity in proof assistants, but the Lean proof assistant still does not have a hammer despite its growing popularity. We present LeanHammer, the first end-to-end domain-general hammer for Lean, built on a novel neural premise selection system for a hammer in dependent type theory. Unlike existing Lean premise selectors, our approach dynamically adapts to user-specific contexts and combines with symbolic proof search and reconstruction to create a practical hammer. With comprehensive evaluations, we show that our premise selector enables LeanHammer to solve 21% more goals relative to existing premise selectors, and generalize well to diverse domains. Our work bridges the gap between neural retrieval and symbolic reasoning, making formal verification more accessible to researchers and practitioners.
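The three stages named above (premise selection, proof search, reconstruction) can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not LeanHammer's implementation: the bag-of-words `embed` function stands in for a neural encoder, `prover` is a hypothetical callable standing in for an external automatic theorem prover, and the reconstruction step is reduced to reporting which selected premises the prover used. All function names here are invented for the example.

```python
from collections import Counter
from math import sqrt
from typing import Callable, Dict, List, Optional


def embed(text: str) -> Counter:
    """Toy stand-in for a neural encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_premises(goal: str, hypotheses: List[str],
                    library: Dict[str, str], k: int = 2) -> List[str]:
    """Rank library premises against the goal together with the user's
    local hypotheses, so the ranking depends on the current context."""
    query = embed(" ".join([goal, *hypotheses]))
    ranked = sorted(library,
                    key=lambda name: cosine(query, embed(library[name])),
                    reverse=True)
    return ranked[:k]


# Hypothetical prover interface: returns the names of the premises it used,
# or None if it found no proof.
Prover = Callable[[str, List[str], Dict[str, str]], Optional[List[str]]]


def hammer(goal: str, hypotheses: List[str], library: Dict[str, str],
           prover: Prover, k: int = 2) -> Optional[List[str]]:
    """Select premises, hand them to the prover, and keep only the
    premises the prover reports using (a placeholder for reconstruction)."""
    names = select_premises(goal, hypotheses, library, k)
    used = prover(goal, hypotheses, {n: library[n] for n in names})
    if used is None:
        return None
    return [n for n in names if n in used]
```

In a real hammer the final step would rebuild a proof that the proof assistant's kernel checks, rather than trusting the external prover's answer; the sketch only shows how the stages hand data to one another.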