COMPOSE: Composing Future Theorems from Citations and Formal Structure

📅 2026-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the problem of generating future mathematical conjectures that both continue the direction of prior research and respect formal dependency structure. It introduces COMPOSE, a dual-graph framework that conditions a language model on scientific citation graphs from arXiv and aligned formal theorem dependency graphs from Mathlib. Using a dataset of 108K paired scientific-formal graph examples, COMPOSE outperforms strong baselines on retrieval of real future papers and achieves the best overall performance under LLM-judge evaluation, producing more grounded and mathematically richer outputs. The results indicate that future mathematical generation benefits from combining scientific context with formal structure.
📝 Abstract
A plausible future mathematical claim must satisfy two constraints: it should follow the direction of prior work and respect the formal dependencies that constrain what can validly follow. Existing approaches typically model only one of these sources, producing claims that are either weakly grounded or insufficiently motivated. We introduce grounded future mathematical generation, where the goal is to generate a plausible future theorem-like claim for an anchor paper using two complementary sources of context: its scientific citation graph and its aligned formal theorem dependency graph. To address this setting, we propose COMPOSE, a dual-graph framework that conditions a language model on both scientific citation context and formal theorem structure. To support it, we construct a dataset of 108K paired scientific-formal graph examples from arXiv and Mathlib, together with a benchmark of 47K future papers from 2024–2025. Experiments show that COMPOSE outperforms strong baselines on retrieval of real future papers and achieves the best overall performance under LLM-judge evaluation, producing more grounded and mathematically richer outputs. These results show that future mathematical generation benefits from combining scientific context with formal structure. The project page is available at https://david-busbib.github.io/COMPOSE-page/.
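As a rough illustration of the dual-graph setting described above, the sketch below pairs an anchor paper's citation edges with the dependency edges of an aligned formal theorem and flattens both into one text context for a language model. It is not the COMPOSE implementation: the abstract does not say how the graphs are encoded or fed to the model, so the `DualGraphExample` class, the `build_prompt` function, the prompt layout, and all node names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Edge = Tuple[str, str]  # (source, target)


@dataclass
class DualGraphExample:
    """Hypothetical container for one paired scientific-formal example."""
    anchor_title: str
    # Citation edges: (citing_paper, cited_paper)
    citation_edges: List[Edge] = field(default_factory=list)
    # Formal dependency edges: (theorem, theorem_it_depends_on)
    dependency_edges: List[Edge] = field(default_factory=list)


def neighbors(edges: List[Edge], node: str) -> List[str]:
    """Return the sorted, de-duplicated targets of edges leaving `node`."""
    return sorted({dst for src, dst in edges if src == node})


def build_prompt(example: DualGraphExample, formal_anchor: str) -> str:
    """Serialize both graph neighborhoods into a single text context.

    Assumption: plain-text serialization is used here only for illustration.
    """
    cited = neighbors(example.citation_edges, example.anchor_title)
    deps = neighbors(example.dependency_edges, formal_anchor)
    lines = [f"Anchor paper: {example.anchor_title}", "Cited work:"]
    lines += [f"- {c}" for c in cited]
    lines += [f"Formal anchor: {formal_anchor}", "Formal dependencies:"]
    lines += [f"- {d}" for d in deps]
    lines.append("Future claim:")
    return "\n".join(lines)


# Placeholder node names; none of these refer to real papers or Mathlib theorems.
example = DualGraphExample(
    anchor_title="anchor_paper",
    citation_edges=[
        ("anchor_paper", "prior_paper_2"),
        ("anchor_paper", "prior_paper_1"),
        ("other_paper", "prior_paper_3"),
    ],
    dependency_edges=[("thm_anchor", "lemma_b"), ("thm_anchor", "lemma_a")],
)
prompt = build_prompt(example, formal_anchor="thm_anchor")
print(prompt)
```

The resulting string ends with an open "Future claim:" slot, which is where a conditioned model would be asked to continue. Dropping either the "Cited work" or the "Formal dependencies" part of the context corresponds to the single-source approaches that the abstract contrasts with the dual-graph setting.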
Problem

Research questions and friction points this paper is trying to address.

future mathematical generation
citation graph
formal theorem dependency
plausible theorem prediction
mathematical claim synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

future mathematical generation
dual-graph framework
citation graph
formal theorem dependency
grounded theorem synthesis