🤖 AI Summary
This work addresses the challenge of deploying deep neural networks on edge devices, where dense linear operators incur prohibitive memory-bandwidth and compute costs. The authors propose EdgeLDR, which integrates quaternion channel coupling with a block-circulant structure to construct linear and convolutional layers of low displacement rank, and introduce a complex adjoint representation that enables efficient FFT-based inference. This design achieves substantial parameter compression and computational savings while preserving model accuracy. Experimental results show that the method attains high compression ratios and competitive accuracy across multiple benchmarks, including CIFAR-10/100, SVHN, and hyperspectral image datasets, with consistently low and stable inference latency on both CPU and GPU platforms.
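To make the "quaternion channel coupling" concrete: quaternion layers mix groups of four real channels through the Hamilton product, so a single quaternion weight (4 real parameters) couples four channels where a dense real 4×4 block would need 16. The sketch below is illustrative only and not the paper's implementation; the function name `hamilton` is ours.

```python
import numpy as np

def hamilton(p, q):
    """Hamilton product of quaternions p = a+bi+cj+dk and q = e+fi+gj+hk,
    each stored as a length-4 array of real components."""
    a, b, c, d = p
    e, f, g, h = q
    return np.array([
        a*e - b*f - c*g - d*h,   # real part
        a*f + b*e + c*h - d*g,   # i part
        a*g - b*h + c*e + d*f,   # j part
        a*h + b*g - c*f + d*e,   # k part
    ])

# One quaternion weight (4 reals) mixes all 4 input channels at once.
w = np.array([1.0, 0.0, 0.0, 0.0])    # identity quaternion
x = np.array([0.5, -1.0, 2.0, 0.25])  # four coupled channels
assert np.allclose(hamilton(w, x), x)  # identity leaves the channels unchanged
```

The interleaving of components in each output row is what shares parameters across channels, which is where the roughly 4x parameter saving of quaternion layers comes from.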
📝 Abstract
Deploying deep neural networks on edge devices is often limited by the memory traffic and compute cost of dense linear operators. While quaternion neural networks improve parameter efficiency by coupling multiple channels through Hamilton products, they typically retain unstructured dense weights; conversely, structured matrices enable fast computation but are usually applied in the real domain. This paper introduces EdgeLDR, a practical framework for quaternion block-circulant linear and convolutional layers that combines quaternion channel mixing with block-circulant parameter structure and enables FFT-based evaluation through the complex adjoint representation. We present reference implementations of EdgeLDR layers and compare FFT-based computation against a naive spatial-domain realization of quaternion circulant products. FFT evaluation yields large empirical speedups over the naive implementation and keeps latency stable as block size increases, making larger compression factors computationally viable. We further integrate EdgeLDR layers into compact CNN and Transformer backbones and evaluate accuracy-compression trade-offs on 32x32 RGB classification (CIFAR-10/100, SVHN) and hyperspectral image classification (Houston 2013, Pavia University), reporting parameter counts and CPU/GPU latency. The results show that EdgeLDR layers provide significant compression with competitive accuracy.
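The FFT-versus-naive comparison in the abstract rests on a standard identity: a circulant matrix is diagonalized by the DFT, so multiplying by it is a circular convolution computable in O(n log n) instead of O(n^2). The following minimal sketch (our illustration, not the EdgeLDR code, and real-valued rather than quaternion-valued) shows the two evaluations agreeing, which is why latency can stay stable as block size grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)  # first column defines the whole circulant matrix
x = rng.standard_normal(n)

# Naive spatial-domain evaluation: materialize the dense circulant matrix,
# whose column j is c cyclically shifted by j, then do an O(n^2) matvec.
C = np.stack([np.roll(c, j) for j in range(n)], axis=1)
y_naive = C @ x

# FFT-based evaluation: C @ x is the circular convolution of c and x,
# computed in O(n log n) without ever building C.
y_fft = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

assert np.allclose(y_naive, y_fft)
```

A block-circulant layer applies this per block, so the parameter count drops by the block size while the FFT keeps the cost of larger blocks nearly flat.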