Semantic Search over 9 Million Mathematical Theorems

📅 2026-02-05
🤖 AI Summary
This work addresses a limitation of existing mathematical literature retrieval systems, which typically return entire papers and do not support fine-grained, theorem-level semantic queries. The authors construct a unified corpus of 9.2 million human-written, research-level theorem statements, the largest publicly available dataset of its kind, and demonstrate efficient semantic retrieval at the scale of millions of mathematical theorems. Each theorem is represented by a short natural-language description, and the authors systematically evaluate how representation context, language model choice, embedding model, and prompting strategy affect retrieval quality, then build an end-to-end semantic search system. On a query set written by professional mathematicians, their approach substantially outperforms existing baselines in both theorem-level and paper-level retrieval. The search tool and dataset are publicly available.

📝 Abstract
Searching for mathematical results remains difficult: most existing tools retrieve entire papers, while mathematicians and theorem-proving agents often seek a specific theorem, lemma, or proposition that answers a query. While semantic search has seen rapid progress, its behavior on large, highly technical corpora such as research-level mathematical theorems remains poorly understood. In this work, we introduce and study semantic theorem retrieval at scale over a unified corpus of 9.2 million theorem statements extracted from arXiv and seven other sources, representing the largest publicly available corpus of human-authored, research-level theorems. We represent each theorem with a short natural-language description as a retrieval representation and systematically analyze how representation context, language model choice, embedding model, and prompting strategy affect retrieval quality. On a curated evaluation set of theorem-search queries written by professional mathematicians, our approach substantially improves both theorem-level and paper-level retrieval compared to existing baselines, demonstrating that semantic theorem search is feasible and effective at web scale. The theorem search tool is available at https://huggingface.co/spaces/uw-math-ai/theorem-search, and the dataset is available at https://huggingface.co/datasets/uw-math-ai/TheoremSearch.
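
The retrieval idea described in the abstract (embed a short natural-language description of each theorem, then rank by similarity to the query) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the four-entry corpus, the function names, and especially the `embed` function, which is a plain bag-of-words count vector standing in for the neural embedding models the paper evaluates. It is not the authors' pipeline. Paper-level scores are obtained here by taking the best theorem score per paper, which is one simple aggregation choice among several.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: (paper_id, theorem_id, natural-language description).
# The real corpus has 9.2 million entries; these four are invented.
CORPUS = [
    ("paper-A", "thm-1", "every finite group of prime order is cyclic"),
    ("paper-A", "thm-2", "a subgroup of a cyclic group is cyclic"),
    ("paper-B", "thm-1", "every bounded monotone sequence of real numbers converges"),
    ("paper-C", "thm-1", "a continuous function on a compact set attains its maximum"),
]


def embed(text):
    """Stand-in embedding: word-count vector (assumption, not a neural model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search_theorems(query, k=3):
    """Return the top-k (score, paper_id, theorem_id) triples for a query."""
    q = embed(query)
    scored = [(cosine(q, embed(desc)), paper, thm) for paper, thm, desc in CORPUS]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def search_papers(query, k=3):
    """Paper-level ranking: score each paper by its best-matching theorem."""
    best = {}
    for score, paper, _ in search_theorems(query, k=len(CORPUS)):
        best[paper] = max(best.get(paper, 0.0), score)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    for hit in search_theorems("bounded monotone sequence converges", k=2):
        print(hit)
    print(search_papers("cyclic group"))
```

A bag-of-words vector only matches exact words, so this sketch would miss paraphrased queries; that gap is what learned embedding models are meant to close, and a corpus of millions of entries would typically also call for an approximate nearest-neighbor index rather than the linear scan used here.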
Problem

Research questions and friction points this paper is trying to address.

semantic search
mathematical theorems
theorem retrieval
research-level mathematics
information retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic theorem retrieval
mathematical theorems
large-scale corpus
natural-language representation
embedding models
Luke Alexander
Math AI Lab, University of Washington, Seattle, United States, Department of Mathematics, University of Washington, Seattle, United States
Eric Leonen
Math AI Lab, University of Washington, Seattle, United States, Department of Applied and Computational Mathematical Sciences, University of Washington, Seattle, United States
Sophie Szeto
Math AI Lab, University of Washington, Seattle, United States, Department of Mathematics, University of Washington, Seattle, United States, Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, United States
Artemii Remizov
Math AI Lab, University of Washington, Seattle, United States, Lake Washington High School, Kirkland, United States
Ignacio Tejeda
Math AI Lab, University of Washington, Seattle, United States, Department of Mathematics, University of Washington, Seattle, United States
Giovanni Inchiostro
Math AI Lab, University of Washington, Seattle, United States, Department of Mathematics, University of Washington, Seattle, United States
Vasily Ilin
University of Washington
sampling, neural networks, Landau equation