Two-Valued Symmetric Circulant Matrices: Applications in Deep Learning

📅 2026-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the storage, computational, and power bottlenecks that the large parameter count of fully connected layers imposes when deep neural networks are deployed on resource-constrained devices. The authors propose an extremely sparse fully connected layer architecture based on two-valued (binary) symmetric circulant matrices. By enforcing strict structured sparsity, the method shares parameters so aggressively that the entire weight matrix is built from only two learnable parameters, without requiring specialized hardware or complex sparse training procedures. Experiments on the MNIST and MIT-BIH datasets are reported to reduce the number of parameters in fully connected layers by more than 80× at the cost of an accuracy drop of roughly four percentage points, which makes the approach a candidate for low-power applications such as edge computing and TinyML.
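To make the "two learnable parameters" idea concrete, the following is a minimal NumPy sketch of one way a two-valued symmetric circulant weight matrix could be assembled. The summary does not say which positions receive which value, so the function name `tvscm`, the `mask` argument, and the default placement pattern are all assumptions made for illustration and are not the authors' construction.

```python
import numpy as np

def tvscm(n, a, b, mask=None):
    """Build an n x n symmetric circulant matrix whose entries are a or b.

    Illustrative only: the placement pattern (mask) is an assumption,
    not taken from the paper.
    """
    k = np.arange(n)
    if mask is None:
        # Hypothetical default: choose the value by distance from the diagonal,
        # which is symmetric under k -> n - k by construction.
        mask = (np.minimum(k, n - k) % 2).astype(bool)
    mask = np.asarray(mask, dtype=bool)
    # A circulant matrix is symmetric iff first_row[k] == first_row[(n - k) % n].
    if not np.array_equal(mask, mask[(-k) % n]):
        raise ValueError("mask must satisfy mask[k] == mask[(n - k) % n]")
    first_row = np.where(mask, a, b)
    # Entry (i, j) of a circulant matrix depends only on (j - i) mod n.
    idx = (k[None, :] - k[:, None]) % n
    return first_row[idx]

W = tvscm(6, 0.5, -0.25)      # a and b stand in for the two trainable scalars
x = np.ones(6)
y = W @ x                     # forward pass of the layer, without bias
```

Only `a` and `b` would be trained in such a layer; the mask is fixed, so the stored state is two scalars regardless of `n`.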

📝 Abstract

Despite the success of deep neural networks in vision, medical diagnosis, and IoT scenarios, their deployment on resource-limited platforms poses serious challenges due to their high storage requirements, computational complexity, and large footprint. In particular, fully connected layers require a large number of weights, making it difficult for edge devices to accommodate them. To overcome these challenges, this paper proposes the Two-Valued Symmetric Circulant Matrix (TVSCM), a very sparse architecture that uses just two weights per layer while keeping the weight matrix circulant and symmetric. This extreme form of structured sparsity makes storage costs negligible compared with traditional full-weight storage. Unlike traditional sparse learning techniques such as low-rank approximation and pruning, it requires neither specialized hardware nor additional training stages. The simulation study reports a reduction of roughly 80$\times$ in model parameters on MNIST, from 623,290 to 7,852, and a smaller reduction on the MIT-BIH arrhythmia dataset, from 24,709 to 942. Accuracy falls from 97.6% to 93.5% on MNIST and from 97.6% to 93.1% on MIT-BIH. Given its minimal architectural requirements and very low power consumption, the architecture is well suited to edge computing platforms, TinyML platforms, IoMT systems, and other battery-powered systems.
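The reduction factors implied by the parameter counts quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the four totals stated above; the dictionary layout and variable names are illustrative.

```python
# Total parameter counts quoted in the abstract: (baseline, TVSCM).
counts = {
    "MNIST": (623_290, 7_852),
    "MIT-BIH": (24_709, 942),
}

# Reduction factor = baseline parameters / compressed parameters.
ratios = {name: dense / compact for name, (dense, compact) in counts.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x fewer parameters")
# MNIST: 79.4x fewer parameters
# MIT-BIH: 26.2x fewer parameters
```

On these totals the reduction is about 79× for MNIST and about 26× for MIT-BIH. The "more than 80×" figure may therefore refer to the fully connected layers alone, as the summary states, rather than to the whole model; the abstract does not say.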

Problem

Research questions and friction points this paper is trying to address.

deep learning

resource-limited platforms

fully connected layers

model compression

edge computing

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-Valued Symmetric Circulant Matrix

structured sparsity

edge computing