🤖 AI Summary
This study investigates whether large language models can learn the deterministic sequence of trees generated by iterated prime factorization of the natural numbers—each integer maps to a unique rooted planar tree, so the sequence is purely arithmetic and fully structured. Method: The authors introduce an autoregressive tree language encoding this arithmetic structure and train a GPT-2–based Transformer from scratch on the first 10¹¹ sequence elements. Contribution/Results: The model captures not only basic syntactic constraints but also nontrivial long-range structural dependencies and regular patterns. The work goes beyond conventional language modeling by showing that Transformers can learn noise-free, fully deterministic arithmetic structure, and it establishes a benchmark for probing the abstract reasoning capabilities and learnability limits of large models, offering a principled framework for assessing structured mathematical generalization.
📝 Abstract
We study whether a Large Language Model can learn the deterministic sequence of trees generated by the iterated prime factorization of the natural numbers. Each integer is mapped to a rooted planar tree, and the resulting sequence $\mathbb{N}\mathcal{T}$ defines an arithmetic text with measurable statistical structure. A transformer network (the GPT-2 architecture) is trained from scratch on the first $10^{11}$ elements, and its predictive ability is then tested on next-word and masked-word prediction tasks. Our results show that the model partially learns the internal grammar of $\mathbb{N}\mathcal{T}$, capturing non-trivial regularities and correlations. This suggests that learnability may extend beyond empirical data to the very structure of arithmetic.