🤖 AI Summary
AI-driven cryptanalysis of post-quantum cryptography is held back by the lack of publicly available, standardized datasets for the Learning With Errors (LWE) problem. To address this gap, we introduce TAPAS (Toolkit for Analysis of Post-quantum cryptography using AI Systems), an open-source benchmark collection of LWE datasets spanning a range of security parameters, noise distributions, and dimensions. TAPAS combines cryptographic instance generation, tunable noise modeling, and a unified preprocessing pipeline to produce reproducible training samples at scale. It also includes baseline evaluations of several AI-based attacks. The main contribution is ready-to-use data for AI-assisted analysis of post-quantum cryptography, which lowers the barrier to entry for researchers and supports off-the-shelf model prototyping and fair comparison across attack methods.
📝 Abstract
AI-powered attacks on Learning with Errors (LWE), an important hard math problem in post-quantum cryptography, rival or outperform "classical" attacks on LWE under certain parameter settings. Despite the promise of this approach, a dearth of accessible data limits AI practitioners' ability to study and improve these attacks. Creating LWE data for AI model training is time- and compute-intensive and requires significant domain expertise. To fill this gap and accelerate AI research on LWE attacks, we propose the TAPAS datasets, a Toolkit for Analysis of Post-quantum cryptography using AI Systems. These datasets cover several LWE settings and can be used off-the-shelf by AI practitioners to prototype new approaches to cracking LWE. This work documents TAPAS dataset creation, establishes attack performance baselines, and lays out directions for future work.
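For readers unfamiliar with LWE, the kind of sample these datasets contain can be sketched as follows. This is a minimal illustration of the textbook definition, b = A·s + e (mod q), and not the TAPAS generation pipeline: the function name `sample_lwe`, the binary secret, the rounded Gaussian error, and all parameter values are assumptions chosen for illustration.

```python
import random


def sample_lwe(n, m, q, sigma, seed=0):
    """Generate m LWE samples of dimension n modulo q.

    Returns (A, b, s, e) with b[i] = <A[i], s> + e[i] (mod q).
    All distribution choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Secret vector: binary entries (one common choice, assumed here).
    s = [rng.randint(0, 1) for _ in range(n)]
    # Public matrix: entries uniform in [0, q).
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    # Error terms: Gaussian with standard deviation sigma, rounded to integers.
    e = [round(rng.gauss(0, sigma)) for _ in range(m)]
    # Noisy inner products, reduced modulo q.
    b = [
        (sum(a_ij * s_j for a_ij, s_j in zip(row, s)) + e_i) % q
        for row, e_i in zip(A, e)
    ]
    return A, b, s, e


if __name__ == "__main__":
    A, b, s, e = sample_lwe(n=8, m=16, q=3329, sigma=3.0)
    print(len(A), len(A[0]), len(b))
```

An attack receives only `(A, b)` and tries to recover `s`. The toy parameters above are far too small to be secure and are not taken from the paper.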