🤖 AI Summary
This work presents a Lean 4 formalization of the equilibrium characterization in the Vlasov–Maxwell–Landau (VML) system, which governs charged plasma dynamics. The project carries out a full AI-assisted research pipeline, from conjecture to formal proof, in which a single supervising mathematician wrote no code. An AI reasoning model (Gemini DeepThink) generated the mathematical proof, an agentic coding tool (Claude Code) translated the natural-language arguments into Lean 4, and a specialized theorem prover (Aristotle) closed 111 lemmas, with the result verified by the Lean kernel. The effort took ten days and cost \$200, and the formalization was finished before the final draft of the corresponding paper. The study also reports AI failure modes and the collaboration strategies that worked. All prompts and git commits are publicly archived.
📝 Abstract
We present a complete Lean 4 formalization of the equilibrium characterization in the Vlasov-Maxwell-Landau (VML) system, which describes the motion of charged plasma. The project demonstrates the full AI-assisted mathematical research loop: an AI reasoning model (Gemini DeepThink) generated the proof from a conjecture, an agentic coding tool (Claude Code) translated it into Lean from natural-language prompts, a specialized prover (Aristotle) closed 111 lemmas, and the Lean kernel verified the result. A single mathematician supervised the process over 10 days at a cost of \$200, writing zero lines of code.
The entire development process is public: all 229 human prompts and 213 git commits are archived in the repository. We report detailed lessons on AI failure modes (hypothesis creep, definition-alignment bugs, and agent avoidance behaviors) and on what worked: the abstract/concrete proof split, adversarial self-review, and the critical role of human review of key definitions and theorem statements. Notably, the formalization was completed before the final draft of the corresponding math paper was finished.