Co-developed Goedel-Prover-V2, the strongest open-source theorem-proving model to date, doubling SOTA Pass@32 performance on PutnamBench with a 20x smaller model.
Introduced Verina: a high-quality benchmark for verifiable code generation in Lean (joint generation of code, specifications, and proofs).
Developed Lean Finder: a semantic search engine for Lean and mathlib that understands mathematicians' intent.
Proposed ProofOptimizer: a system that automatically shortens formal proofs without human demonstrations, reducing proof length by up to 87% on MiniF2F and 57% on PutnamBench while preserving correctness.
Co-authored the position paper 'Formal Mathematical Reasoning: A New Frontier in AI', accepted to ICML 2025 (Spotlight); a separate version was accepted to Communications of the ACM.
Led or contributed to key projects and datasets, including CoqGym, LeanDojo, LIPS, Goedel-Prover, LeanEuclid, Verina, Lean Copilot, SciInstruct, NLProofS, and MetaQNL.
Published multiple preprints in 2025 on formal reasoning, automated theorem proving, and verifiable code generation.