Teaching Transformers Modular Arithmetic at Scale

📅 2024-10-04
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Machine learning models struggle to capture large-scale modular arithmetic—particularly for high-dimensional LWE instances with large dimension $N$ and modulus $q$—due to the inherent cyclic structure and combinatorial complexity of modular operations. Method: This paper proposes a scalable Transformer architecture featuring: (i) synthesis of highly diverse modular arithmetic training data; (ii) angular embedding to explicitly encode the circular topology of modular reduction; and (iii) a cyclic-consistency loss that enforces periodic constraints inherent in modular addition. Contribution/Results: Our approach achieves stable, high-accuracy modular addition prediction for practical LWE parameters ($N=256$, $q=3329$) for the first time—substantially outperforming prior state-of-the-art methods limited to $N leq 6$ and $q leq 1000$. The framework demonstrates strong generalizability and is readily transferable to other modular-arithmetic-intensive cryptanalytic tasks.

Technology Category

Application Category

📝 Abstract
Modular addition is, on its face, a simple operation: given $N$ elements in $mathbb{Z}_q$, compute their sum modulo $q$. Yet, scalable machine learning solutions to this problem remain elusive: prior work trains ML models that sum $N le 6$ elements mod $q le 1000$. Promising applications of ML models for cryptanalysis-which often involve modular arithmetic with large $N$ and $q$-motivate reconsideration of this problem. This work proposes three changes to the modular addition model training pipeline: more diverse training data, an angular embedding, and a custom loss function. With these changes, we demonstrate success with our approach for $N = 256, q = 3329$, a case which is interesting for cryptographic applications, and a significant increase in $N$ and $q$ over prior work. These techniques also generalize to other modular arithmetic problems, motivating future work.
Problem

Research questions and friction points this paper is trying to address.

Enhancing ML attacks on Learning with Errors cryptography
Improving ML models' performance on modular arithmetic tasks
Addressing training difficulties with custom data and loss functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Custom training data distributions for modular arithmetic
Carefully designed loss function for problem structure
Enabling ML models to sum 128 elements modulo q
🔎 Similar Papers
No similar papers found.
E
Eshika Saxena
FAIR, Meta
A
Alberto Alfarano
FAIR, Meta
Emily Wenger
Emily Wenger
Duke University
Machine LearningSecurityPrivacy
K
Kristin Lauter
FAIR, Meta