Pre-trained Language Models Learn Remarkably Accurate Representations of Numbers

๐Ÿ“… 2025-06-10
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Pretrained language models (PLMs) are commonly perceived as lacking precise numerical representation capabilities, leading to arithmetic errors; however, existing probing methods fail to account for the inherent sinusoidal periodic structure in PLM embeddings, thereby underestimating actual numerical encoding fidelity. Method: We propose SineProbeโ€”a sinusoidal pattern-aware probe that, for the first time, models and disentangles the periodic structure of numeric embeddings in the frequency domain, enabling high-fidelity decoding of numeric values from input embeddings. Results: SineProbe achieves >99% numeric decoding accuracy across multiple open-source PLMs; the decoded embedding precision explains over 70% of fundamental arithmetic errors, establishing a causal link between representation fidelity and error occurrence; after lightweight alignment fine-tuning, addition and subtraction error rates drop significantly. This work challenges the prevailing assumption that โ€œPLMs are inherently poor at arithmetic,โ€ offering a new paradigm for interpretable numerical representation and arithmetic capability enhancement.

Technology Category

Application Category

๐Ÿ“ Abstract
Pretrained language models (LMs) are prone to arithmetic errors. Existing work showed limited success in probing numeric values from models' representations, indicating that these errors can be attributed to the inherent unreliability of distributionally learned embeddings in representing exact quantities. However, we observe that previous probing methods are inadequate for the emergent structure of learned number embeddings with sinusoidal patterns. In response, we propose a novel probing technique that decodes numeric values from input embeddings with near-perfect accuracy across a range of open-source LMs. This proves that after the sole pre-training, LMs represent numbers with remarkable precision. Finally, we find that the embeddings' preciseness judged by our probe's accuracy explains a large portion of LM's errors in elementary arithmetic, and show that aligning the embeddings with the pattern discovered by our probe can mitigate these errors.
Problem

Research questions and friction points this paper is trying to address.

Probing numeric values from language models' representations inadequately
Decoding numeric values accurately from embeddings in LMs
Mitigating arithmetic errors by aligning embeddings with patterns
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel probing technique decodes numeric values accurately
Pre-trained LMs represent numbers with high precision
Aligning embeddings mitigates arithmetic errors effectively
๐Ÿ”Ž Similar Papers
No similar papers found.
M
Marek Kadlvc'ik
TransformersClub @ Faculty of Informatics, Masaryk University
M
Michal vStef'anik
Language Technology, University of Helsinki
Timothee Mickus
Timothee Mickus
University of Helsinki
NLGNLPDistributional SemanticsWord Embeddings
Michal Spiegel
Michal Spiegel
Masaryk University, Brno; Kempelen Institute of Intelligent Technologies, Bratislava
computer scienceartificial intelligencenatural language processinglarge language models
J
Josef Kuchavr
TransformersClub @ Faculty of Informatics, Masaryk University