🤖 AI Summary
This work addresses a well-known limitation of convolutional neural networks (CNNs): their sensitivity to spatial translations of the input, caused by downsampling operations that break translation invariance. To resolve this, the authors propose Gaussian-Hermite Sampling (GHS), a novel downsampling strategy that incorporates Gaussian-Hermite polynomials into the sampling process. Notably, GHS endows standard CNNs with strict translation invariance from the outset of training, without requiring any architectural modifications or changes to the training procedure. The method achieves 100% classification consistency under translations on the CIFAR-10, CIFAR-100, and MNIST-rot benchmarks while simultaneously improving overall classification accuracy, demonstrating that enforcing theoretical invariance at the sampling stage can yield both robustness and performance gains.
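The failure mode the summary describes is easy to see in one dimension: plain strided subsampling (the core of strided convolution and pooling) is not equivariant to shifts, because a one-sample translation can move a feature on or off the sampling grid. A minimal numpy sketch of the problem:

```python
import numpy as np

# An impulse signal and its one-sample circular shift.
x = np.zeros(8)
x[1] = 1.0
x_shift = np.roll(x, 1)

# Stride-2 subsampling, as used in strided convolution / pooling.
down = lambda s: s[::2]

y, y_shift = down(x), down(x_shift)
# The original sampling grid misses the impulse entirely, while the
# shifted input keeps it: no translation of y can reproduce y_shift.
print(y)        # [0. 0. 0. 0.]
print(y_shift)  # [0. 1. 0. 0.]
```

This aliasing is exactly why downsampled CNNs can change their prediction when the input moves by a single pixel, motivating sampling schemes (such as the proposed GHS) that are consistent under shifts.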
📄 Abstract
Convolutional neural networks (CNNs) are not inherently shift invariant or equivariant. The downsampling operation used in CNNs is one of the key reasons the shift-invariance property breaks. At the same time, downsampling is important for improving computational efficiency and for enlarging the receptive field to capture more contextual information. In this work, we propose Gaussian-Hermite Sampling (GHS), a novel downsampling strategy designed to achieve exact shift invariance. GHS leverages Gaussian-Hermite polynomials to perform shift-consistent sampling, enabling CNN layers to remain invariant to arbitrary spatial shifts even prior to training. When integrated into standard CNN architectures, the proposed method embeds shift invariance directly at the layer level without requiring architectural modifications or additional training procedures. We evaluate the proposed approach on the CIFAR-10, CIFAR-100, and MNIST-rot datasets. Experimental results demonstrate that GHS significantly improves shift consistency, achieving 100% classification consistency under spatial shifts, while also improving classification accuracy compared to baseline CNN models.
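The abstract does not spell out how the Gaussian-Hermite polynomials enter the sampling step, so the following is only an illustration of the ingredient itself: discrete 1-D Gaussian-Hermite functions, i.e. Hermite polynomials H_n weighted by a Gaussian envelope. The function name `gauss_hermite`, the `sigma` parameter, and the kernel support are assumptions for this sketch, not the paper's construction:

```python
import numpy as np
from numpy.polynomial.hermite import hermval

def gauss_hermite(n, x, sigma=1.0):
    """Gaussian-weighted (physicists') Hermite function:
    H_n(x / sigma) * exp(-x^2 / (2 sigma^2)).  Illustrative only."""
    t = x / sigma
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0          # select the degree-n Hermite polynomial
    return hermval(t, coeffs) * np.exp(-0.5 * t ** 2)

# Discrete kernels on a small support for orders 0..2.
xs = np.arange(-3, 4, dtype=float)
kernels = [gauss_hermite(n, xs) for n in range(3)]
```

Kernels of even order are symmetric and those of odd order are antisymmetric, which is what makes such a basis attractive for smooth, shift-consistent resampling of feature maps.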