🤖 AI Summary
To reduce the high computational cost of CNN inference under fully homomorphic encryption (FHE), this paper introduces Coefficients-in-Slot (CinS), a novel encoding that performs multiple convolutions in one HE multiplication without costly slot permutations. The authors observe that CinS encoding results from applying the first several steps of the discrete Fourier transform (DFT) to a ciphertext in conventional Slot encoding. Because bootstrapping begins with a DFT, the conversion between the two encodings can be absorbed into bootstrapping. Building on this insight, they design optimized conv2d execution flows for the CKKS scheme and apply them to end-to-end CNN implementations. Experiments show a speedup of up to 5.68× for conv2d-activation sequences over state-of-the-art FHE-based private inference, and private inference of an ImageNet-scale CNN within a few seconds.
📝 Abstract
Fully homomorphic encryption (FHE) is a promising cryptographic primitive for realizing private neural network inference (PI) services: it allows a client to fully offload the inference task to a cloud server while keeping the client's data hidden from the server. This work proposes NeuJeans, an FHE-based solution for the PI of deep convolutional neural networks (CNNs). NeuJeans tackles the critical problem of the enormous computational cost of evaluating CNNs under FHE. We introduce a novel encoding method called Coefficients-in-Slot (CinS) encoding, which enables multiple convolutions in one HE multiplication without costly slot permutations. We further observe that CinS encoding is obtained by conducting the first several steps of the Discrete Fourier Transform (DFT) on a ciphertext in conventional Slot encoding. Because bootstrapping a ciphertext starts with a DFT, this property lets us avoid the cost of converting between CinS and Slot encodings. Exploiting this, we devise optimized execution flows for various two-dimensional convolution (conv2d) operations and apply them to end-to-end CNN implementations. NeuJeans accelerates conv2d-activation sequences by up to 5.68 times compared to state-of-the-art FHE-based PI work and performs the PI of an ImageNet-scale CNN within just a few seconds.
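The link between a partial DFT and "multiple convolutions in one multiplication" can be illustrated with a small plaintext sketch. It is only an analogy under stated assumptions, not the NeuJeans algorithm: it works on unencrypted complex coefficients, uses the ring modulo X^N - 1 rather than the X^N + 1 ring used by CKKS, and the helper names `polymul_mod`, `partial_dft`, and `close` are hypothetical. Splitting a length-N coefficient vector into k blocks and applying a length-k DFT across the blocks yields k smaller polynomials, and one multiplication in the large ring then corresponds to k independent (twisted) convolutions of length N/k.

```python
import cmath
import random

def polymul_mod(a, b, c):
    """Multiply polynomials a and b (coefficient lists of length m) modulo X^m - c."""
    m = len(a)
    out = [0j] * m
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < m:
                out[i + j] += ai * bj
            else:
                out[i + j - m] += c * ai * bj  # X^m is replaced by c
    return out

def partial_dft(a, k):
    """Apply a length-k DFT across the k blocks of a.

    With m = len(a) // k and w = exp(2*pi*i/k), row j of the result
    holds the coefficients of a(X) mod (X^m - w^j).
    """
    m = len(a) // k
    w = cmath.exp(2j * cmath.pi / k)
    blocks = [a[t * m:(t + 1) * m] for t in range(k)]
    return [
        [sum(w ** (j * t) * blocks[t][i] for t in range(k)) for i in range(m)]
        for j in range(k)
    ]

def close(u, v, tol=1e-9):
    """Compare two coefficient lists up to floating-point error."""
    return len(u) == len(v) and all(abs(x - y) < tol for x, y in zip(u, v))

random.seed(0)
N, k = 16, 4  # illustrative sizes only
a = [random.uniform(-1, 1) for _ in range(N)]
b = [random.uniform(-1, 1) for _ in range(N)]
w = cmath.exp(2j * cmath.pi / k)

# One multiplication in the large ring C[X]/(X^N - 1) ...
product = polymul_mod(a, b, 1)

# ... matches k independent small convolutions after the partial DFT.
A, B, P = partial_dft(a, k), partial_dft(b, k), partial_dft(product, k)
ok = all(close(P[j], polymul_mod(A[j], B[j], w ** j)) for j in range(k))
print(ok)
```

Completing the remaining DFT stages on each small polynomial would reduce every component to a single value, which is the plaintext counterpart of a fully slot-wise representation. Stopping early keeps short polynomials whose products are convolutions, which is the intuition the abstract describes for CinS encoding.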