🤖 AI Summary
Edge devices, constrained by limited resources and power budgets, demand highly efficient and lightweight neural network acceleration. Existing quantization and binarization approaches still rely on the conventional computing paradigm and struggle to balance accuracy and efficiency. Inspired by Kolmogorov–Arnold Networks (KANs), this work proposes BiKA, an architecture that translates the KAN concept into a hardware-friendly design for the first time. BiKA eliminates multiplications entirely, replacing learnable nonlinear functions with learnable binary thresholds and implementing a systolic-like dataflow using only comparators and accumulators. A prototype on the Ultra96-V2 platform demonstrates significant hardware savings: 27.73% and 51.54% reductions in resource utilization compared to state-of-the-art binarized and quantized accelerators, respectively, while maintaining competitive accuracy.
📝 Abstract
Lightweight neural network accelerators are essential for edge devices with limited resources and power constraints. While quantization and binarization can efficiently reduce hardware cost, they still rely on the conventional Artificial Neural Network (ANN) computation pattern. The recently proposed Kolmogorov–Arnold Network (KAN) presents a novel network paradigm built on learnable nonlinear functions. However, it is computationally expensive for hardware deployment. Inspired by KAN, we propose BiKA, a multiply-free architecture that replaces nonlinear functions with learnable binary thresholds, introducing an extremely lightweight computational pattern that requires only comparators and accumulators. Our FPGA prototype on the Ultra96-V2 shows that BiKA reduces hardware resource usage by 27.73% and 51.54% compared with binarized and quantized neural network systolic array accelerators, while maintaining competitive accuracy. BiKA provides a promising direction for hardware-friendly neural network design on edge devices.
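To make the compare-and-accumulate pattern concrete, the sketch below gives one possible reading of a BiKA-style layer in NumPy: each KAN-style edge function is replaced by a learnable threshold whose output is binary, so a forward pass needs only comparisons and additions. The function name `bika_layer`, the ±1 output coding, and the per-edge threshold shape are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def bika_layer(x, thresholds):
    """
    Hypothetical compare-and-accumulate layer inspired by the BiKA description.

    x          : (in_features,)               input activations
    thresholds : (out_features, in_features)  learnable per-edge thresholds

    Each edge acts as a comparator against its learned threshold and emits a
    binary value (+1 or -1); each output neuron is just an accumulator over
    its incoming edges, so no multiplications are needed.
    """
    # comparator: +1 if the input exceeds the edge's learned threshold, else -1
    edge_out = np.where(x[None, :] > thresholds, 1, -1)   # (out, in)
    # accumulator: sum the binary edge outputs for each output neuron
    return edge_out.sum(axis=1)                            # (out,)

# usage with random data (4 outputs, 8 inputs)
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
thresholds = rng.standard_normal((4, 8))
print(bika_layer(x, thresholds))
```

In hardware, each edge in this sketch would map to a comparator and each output sum to an accumulator, which is consistent with the resource profile the abstract claims; the actual BiKA dataflow, training procedure, and threshold parameterization are described in the paper itself.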