FlexMem: High-Parallel Near-Memory Architecture for Flexible Dataflow in Fully Homomorphic Encryption

📅 2025-03-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the memory bandwidth bottleneck and inefficient near-memory processing (NMP) caused by irregular memory access patterns in hardware acceleration of fully homomorphic encryption (FHE), this paper proposes FlexMem, a highly parallel near-memory architecture. The approach introduces: (1) a configurable near-memory computing architecture that supports flexible dataflows through variable-stride memory access and dynamically reconfigurable heterogeneous interconnect topologies; and (2) a dual-granularity scheduling mechanism that coordinates polynomial-level and ciphertext-level execution to match FHE's inherently irregular computation and memory access behavior. Experimental evaluation shows that FlexMem achieves a 1.12× performance improvement over state-of-the-art NMP accelerators for FHE while attaining 95.7% near-memory bandwidth utilization, which alleviates the memory wall limitation in FHE acceleration.

📝 Abstract

Fully Homomorphic Encryption (FHE) imposes substantial memory bandwidth demands, presenting significant challenges for efficient hardware acceleration. Near-Memory Processing (NMP) has emerged as a promising architectural solution to alleviate the memory bottleneck. However, the irregular memory access patterns and flexible dataflows inherent to FHE limit the effectiveness of existing NMP accelerators, which fail to fully utilize the available near-memory bandwidth. In this work, we propose FlexMem, a near-memory accelerator featuring highly parallel computational units with varying memory access strides and interconnect topologies to effectively handle irregular memory access patterns. Furthermore, we design polynomial-level and ciphertext-level dataflows to efficiently utilize near-memory bandwidth under varying degrees of polynomial parallelism and to enhance parallel performance. Experimental results demonstrate that FlexMem achieves a 1.12× performance improvement over state-of-the-art near-memory architectures, with 95.7% near-memory bandwidth utilization.
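
To make "varying memory access strides" concrete, the sketch below lists the index pairs touched by each stage of a radix-2 butterfly network, the access pattern of the number-theoretic transform (NTT) commonly used for polynomial multiplication in FHE schemes. This is an illustrative assumption: the abstract does not name the kernel responsible for the irregular accesses, and the function names here are hypothetical rather than part of FlexMem.

```python
# Illustrative sketch only: assumes the stride variation comes from
# radix-2 NTT butterflies, which the abstract does not state.

def butterfly_pairs(n, stage):
    """Return the (low, high) index pairs of one radix-2 butterfly stage.

    n must be a power of two; the stride between paired elements is 2**stage.
    """
    stride = 1 << stage
    pairs = []
    # Each block of 2*stride elements is split into a low and a high half.
    for block in range(0, n, 2 * stride):
        for j in range(stride):
            pairs.append((block + j, block + j + stride))
    return pairs


def stage_strides(n):
    """Return the access stride used by each of the log2(n) stages."""
    stages = n.bit_length() - 1
    return [1 << s for s in range(stages)]


if __name__ == "__main__":
    n = 8
    for stage, stride in enumerate(stage_strides(n)):
        print(f"stage {stage}: stride {stride}: {butterfly_pairs(n, stage)}")
```

The stride doubles at every stage, so a compute unit wired for a single fixed stride would be well matched to only one stage. This is one plausible reading of why configurable strides and interconnect topologies help.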

Problem

Research questions and friction points this paper is trying to address.

Addresses high memory bandwidth demands in Fully Homomorphic Encryption

Handles the irregular memory access patterns that limit existing Near-Memory Processing accelerators

Enhances near-memory bandwidth utilization for flexible dataflows

Innovation

Methods, ideas, or system contributions that make the work stand out.

Highly parallel computational units with varying memory access strides

Polynomial-level and ciphertext-level dataflow design

Flexible interconnect topologies for irregular accesses

🔎 Similar Papers

Cheddar: A Swift Fully Homomorphic Encryption Library for CUDA GPUs