🤖 AI Summary
Binary neural networks (BNNs) suffer from optimization difficulties and limited accuracy because training must respect the extreme 1-bit weight constraint. This work integrates hyperbolic geometry into BNN training, proposing the hyperbolic BNN (HBNN) framework together with the exponential parametrization cluster (EPC) method for weight quantization. Using the Riemannian exponential map, HBNN transforms the constrained optimization problem in hyperbolic space into an unconstrained one that can be solved with standard gradient-based methods in Euclidean space. EPC goes further: compared with the Riemannian exponential map, it shrinks the segment domain via a diffeomorphism, which raises the probability of weight flips during training and thereby maximizes the information gain. Evaluated on CIFAR-10, CIFAR-100, and ImageNet with VGG-Small, ResNet-18, and ResNet-34 architectures, HBNN outperforms state-of-the-art BNN methods across all benchmarks, supporting the deployment of efficient, accurate neural networks on resource-constrained edge devices.
📝 Abstract
Binary neural networks (BNNs) convert full-precision weights and activations into their extreme 1-bit counterparts, making them particularly suitable for deployment on lightweight mobile devices. Whereas general neural networks are formulated as an unconstrained optimization problem and optimized in continuous space, BNNs are typically formulated as a constrained optimization problem and optimized in the binarized space. This article introduces the hyperbolic BNN (HBNN), which leverages the framework of hyperbolic geometry to optimize the constrained problem. Specifically, we transform the constrained problem in hyperbolic space into an unconstrained one in Euclidean space via the Riemannian exponential map. In addition, we propose the exponential parametrization cluster (EPC) method, which, compared with the Riemannian exponential map, shrinks the segment domain based on a diffeomorphism. This increases the probability of weight flips, thereby maximizing the information gain in BNNs. Experimental results on the CIFAR-10, CIFAR-100, and ImageNet classification datasets with VGG-Small, ResNet-18, and ResNet-34 models demonstrate the superior performance of our HBNN over state-of-the-art methods.
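The central idea of trading a constrained problem for an unconstrained Euclidean one can be sketched with the exponential map at the origin of the Poincaré ball. The following NumPy sketch is illustrative only and does not reproduce the paper's method: the `exp_map_origin` function is the standard Poincaré-ball exponential map at the origin, and the sign-based binarization (which in training would be paired with a straight-through gradient estimator) is a common BNN construction assumed here, not taken from the article.

```python
import numpy as np

def exp_map_origin(v, c=1.0, eps=1e-9):
    """Exponential map at the origin of the Poincare ball with curvature -c.

    Maps an unconstrained Euclidean tangent vector v into the open unit
    ball: exp_0(v) = tanh(sqrt(c)*||v||) * v / (sqrt(c)*||v||).
    Gradient descent can thus run on v in ordinary Euclidean space while
    the mapped weights always satisfy the ball constraint.
    """
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    scale = np.tanh(np.sqrt(c) * norm) / (np.sqrt(c) * norm + eps)
    return scale * v

def binarize(w):
    """Forward binarization to {-1, +1}; a real training loop would use a
    straight-through estimator so gradients flow to the latent weights."""
    return np.where(w >= 0, 1.0, -1.0)

# Unconstrained latent weights (Euclidean tangent space at the origin)
v = np.array([[2.0, -0.3], [-1.5, 0.7]])
w = exp_map_origin(v)   # constrained to the open unit ball
w_b = binarize(w)       # 1-bit weights used at inference
```

Because the map preserves direction and only rescales the norm, the sign pattern of the latent weights survives binarization; the intuition behind EPC is that a mapping with a smaller image keeps weights nearer the decision boundary, making sign flips more likely during training.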