🤖 AI Summary
CSIDH key exchange offers exceptionally small key sizes, but its key generation is computationally intensive and must run in constant time to resist side-channel attacks, both of which hinder hardware deployment. This work proposes the first unified FPGA/ASIC hardware architecture for constant-time CSIDH-512 key generation. A top-level finite-state machine orchestrates a deeply pipelined arithmetic logic unit, which is built around a parallelized schoolbook multiplier and performs 512-bit Montgomery modular multiplication. On FPGA (200 MHz), key generation completes in 515 ms; on ASIC (180 nm technology, approximately 180 MHz), it requires 591 ms. To the best of the authors' knowledge, this is the first hardware implementation of CSIDH to publish performance figures on both platforms, giving later designs a baseline to compare against. The architecture serves as a reference point for hardware acceleration of isogeny-based cryptography and supports the practical deployment of post-quantum cryptographic primitives.
📝 Abstract
The commutative supersingular isogeny Diffie-Hellman (CSIDH) algorithm is a promising post-quantum key exchange protocol, notable for its exceptionally small key sizes, but hindered by computationally intensive key generation. Furthermore, practical implementations must operate in constant time to mitigate side-channel vulnerabilities, which presents an additional performance challenge. This paper presents, to our knowledge, the first comprehensive hardware study of CSIDH, establishing a performance baseline with a unified architecture on both field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) platforms. The architecture features a top-level finite state machine (FSM) that orchestrates a deeply pipelined arithmetic logic unit (ALU) to accelerate the underlying 512-bit finite field operations. The ALU employs a parallelized schoolbook multiplier, completing a 512$\times$512-bit multiplication in 22 clock cycles and enabling a full Montgomery modular multiplication in 87 cycles. The constant-time CSIDH-512 design requires $1.03\times10^{8}$ clock cycles per key generation. When implemented on a Xilinx Zynq UltraScale+ FPGA, the architecture achieves a 200 MHz clock frequency, corresponding to a 515 ms latency. For ASIC implementation in a 180 nm process, the design requires $1.065\times10^{8}$ clock cycles and achieves a $\sim$180 MHz frequency, resulting in a key generation latency of 591 ms. By providing the first public hardware performance metrics for CSIDH on both FPGA and ASIC platforms, this work delivers a crucial benchmark for future isogeny-based post-quantum cryptography (PQC) accelerators.
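
To make the arithmetic in the abstract concrete, the following is a minimal software sketch of Montgomery modular multiplication with $R = 2^{512}$, together with the cycle-count-to-latency conversion behind the quoted figures. It is a functional model only, not the paper's hardware design. The prime is an assumption: the abstract does not state it, so the sketch uses the CSIDH-512 prime from the original CSIDH proposal, $p = 4\,\ell_1\cdots\ell_{74} - 1$ with the 73 smallest odd primes and $\ell_{74} = 587$. All function names (`redc`, `mont_mul`, `to_mont`, `from_mont`, `latency_ms`) are hypothetical.

```python
from math import prod


def first_odd_primes(count):
    """Return the `count` smallest odd primes by trial division."""
    primes = []
    n = 3
    while len(primes) < count:
        if all(n % q for q in primes):
            primes.append(n)
        n += 2
    return primes


# Assumption (not stated in the abstract): the CSIDH-512 prime
# p = 4 * l_1 * ... * l_74 - 1, l_1..l_73 = smallest odd primes, l_74 = 587.
P = 4 * prod(first_odd_primes(73) + [587]) - 1

R_BITS = 512                      # Montgomery radix R = 2^512
R = 1 << R_BITS
MASK = R - 1
assert P < R and P % 2 == 1       # REDC needs an odd modulus below R
P_NEG_INV = (-pow(P, -1, R)) % R  # -p^(-1) mod R, precomputed once


def redc(t):
    """Montgomery reduction: return t * R^(-1) mod P for 0 <= t < R * P."""
    m = ((t & MASK) * P_NEG_INV) & MASK  # m = (t mod R) * (-p^(-1)) mod R
    u = (t + m * P) >> R_BITS            # t + m*P is divisible by R
    return u - P if u >= P else u        # single conditional subtraction


def to_mont(a):
    """Map a into the Montgomery domain: a * R mod P."""
    return (a << R_BITS) % P


def from_mont(a_bar):
    """Map a Montgomery-domain value back: a_bar * R^(-1) mod P."""
    return redc(a_bar)


def mont_mul(a_bar, b_bar):
    """Multiply two Montgomery-domain values; the result stays in the domain."""
    return redc(a_bar * b_bar)


def latency_ms(cycles, clock_hz):
    """Convert a clock-cycle count to milliseconds at a given frequency."""
    return cycles * 1000 / clock_hz


if __name__ == "__main__":
    a, b = 0x1234567890ABCDEF, 0xFEDCBA0987654321
    product = from_mont(mont_mul(to_mont(a), to_mont(b)))
    print(product == (a * b) % P)
    print(latency_ms(103_000_000, 200_000_000))   # FPGA figures from the abstract
    print(latency_ms(106_500_000, 180_000_000))   # ASIC, taking ~180 MHz as 180 MHz
```

Python integers are not constant time, so this model says nothing about side-channel behavior; in a constant-time implementation the final conditional subtraction would typically be a data-independent select rather than a branch. The latency helper simply divides cycles by frequency, which reproduces the 515 ms FPGA figure and lands between 591 and 592 ms for the ASIC when the approximate clock is taken as exactly 180 MHz.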