Teaching Transformers Modular Arithmetic at Scale

📅 2024-10-04

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

200K/year

🤖 AI Summary

Machine learning models struggle to capture large-scale modular arithmetic—particularly for high-dimensional LWE instances with large dimension $N$ and modulus $q$—due to the inherent cyclic structure and combinatorial complexity of modular operations. Method: This paper proposes a scalable Transformer architecture featuring: (i) synthesis of highly diverse modular arithmetic training data; (ii) angular embedding to explicitly encode the circular topology of modular reduction; and (iii) a cyclic-consistency loss that enforces periodic constraints inherent in modular addition. Contribution/Results: Our approach achieves stable, high-accuracy modular addition prediction for practical LWE parameters ($N=256$, $q=3329$) for the first time—substantially outperforming prior state-of-the-art methods limited to $N leq 6$ and $q leq 1000$. The framework demonstrates strong generalizability and is readily transferable to other modular-arithmetic-intensive cryptanalytic tasks.

Technology Category

Application Category

📝 Abstract

Modular addition is, on its face, a simple operation: given $N$ elements in $mathbb{Z}_q$, compute their sum modulo $q$. Yet, scalable machine learning solutions to this problem remain elusive: prior work trains ML models that sum $N le 6$ elements mod $q le 1000$. Promising applications of ML models for cryptanalysis-which often involve modular arithmetic with large $N$ and $q$-motivate reconsideration of this problem. This work proposes three changes to the modular addition model training pipeline: more diverse training data, an angular embedding, and a custom loss function. With these changes, we demonstrate success with our approach for $N = 256, q = 3329$, a case which is interesting for cryptographic applications, and a significant increase in $N$ and $q$ over prior work. These techniques also generalize to other modular arithmetic problems, motivating future work.

Problem

Research questions and friction points this paper is trying to address.

Enhancing ML attacks on Learning with Errors cryptography

Improving ML models' performance on modular arithmetic tasks

Addressing training difficulties with custom data and loss functions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Custom training data distributions for modular arithmetic

Carefully designed loss function for problem structure

Enabling ML models to sum 128 elements modulo q

🔎 Similar Papers

No similar papers found.

💼 Related Jobs

Machine Learning Research Engineer

Booz Allen Hamilton

$99,000.00 to $225,000.00 (annualized USD)

Remote

Machine Learning Engineer