Towards Autonomous Mathematics Research

📅 2026-02-10

🤖 AI Summary

This work aims to advance artificial intelligence from competition-level problem solving toward autonomous mathematical research, which requires navigating vast literature and constructing long-horizon proofs. It introduces Aletheia, a mathematics research agent built on an advanced version of Gemini Deep Think, combining a novel inference-time scaling law, intensive tool use, and an end-to-end loop that generates, verifies, and revises solutions in natural language. The paper also suggests quantifying standard levels of autonomy and novelty for AI-assisted results and proposes “human-AI interaction cards” for transparency. Three milestones are reported: a research paper (Feng26), generated without human intervention, that calculates structure constants in arithmetic geometry called eigenweights; a human-AI collaborative paper (LeeSeo26) proving bounds on systems of interacting particles called independent sets; and a semi-autonomous evaluation of 700 open problems from Bloom’s Erdős Conjectures database, including autonomous solutions to four open questions.

📝 Abstract

Recent advances in foundational models have yielded reasoning systems capable of achieving a gold-medal standard at the International Mathematical Olympiad. The transition from competition-level problem-solving to professional research, however, requires navigating vast literature and constructing long-horizon proofs. In this work, we introduce Aletheia, a math research agent that iteratively generates, verifies, and revises solutions end-to-end in natural language. Specifically, Aletheia is powered by an advanced version of Gemini Deep Think for challenging reasoning problems, a novel inference-time scaling law that extends beyond Olympiad-level problems, and intensive tool use to navigate the complexities of mathematical research. We demonstrate the capability of Aletheia from Olympiad problems to PhD-level exercises and, most notably, through several distinct milestones in AI-assisted mathematics research: (a) a research paper (Feng26) generated by AI without any human intervention in calculating certain structure constants in arithmetic geometry called eigenweights; (b) a research paper (LeeSeo26) demonstrating human-AI collaboration in proving bounds on systems of interacting particles called independent sets; and (c) an extensive semi-autonomous evaluation (Feng et al., 2026a) of 700 open problems on Bloom's Erdős Conjectures database, including autonomous solutions to four open questions. In order to help the public better understand the developments pertaining to AI and mathematics, we suggest quantifying standard levels of autonomy and novelty of AI-assisted results, and propose a novel concept of human-AI interaction cards for transparency. We conclude with reflections on human-AI collaboration in mathematics and share all prompts as well as model outputs at https://github.com/google-deepmind/superhuman/tree/main/aletheia.
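
The abstract describes an agent that iteratively generates, verifies, and revises solutions. The control flow of such a loop might look roughly like the sketch below. This is an illustration only, not Aletheia's implementation: the function name `solve_with_revision`, the three callables, and the `max_rounds` budget are all hypothetical, and in a real system each callable would wrap a model call rather than a plain function.

```python
from typing import Callable, Optional, Tuple


def solve_with_revision(
    problem: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], Tuple[bool, str]],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Generic generate-verify-revise loop (hypothetical sketch).

    generate(problem)                    -> candidate solution text
    verify(problem, candidate)           -> (accepted, feedback)
    revise(problem, candidate, feedback) -> improved candidate
    Returns the first accepted candidate, or None if the budget runs out.
    """
    candidate = generate(problem)
    for round_index in range(max_rounds):
        accepted, feedback = verify(problem, candidate)
        if accepted:
            return candidate
        # Only revise if another verification round remains.
        if round_index < max_rounds - 1:
            candidate = revise(problem, candidate, feedback)
    return None
```

Returning `None` when no candidate passes verification, instead of the last unverified draft, is a design choice made here for illustration: it lets a caller distinguish "no verified solution" from a solution.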

Problem

Research questions and friction points this paper is trying to address.

autonomous mathematics research

AI-assisted theorem proving

open problem solving

mathematical reasoning

human-AI collaboration

Innovation

Methods, ideas, or system contributions that make the work stand out.

mathematical reasoning agent

inference-time scaling law

autonomous theorem proving