🤖 AI Summary
Binary neural networks (BNNs) offer large savings in speed, memory, and energy, but binarizing weights and activations tightly constrains the possible values of each entry in the generated feature maps, causing substantial accuracy degradation on challenging tasks relative to real-valued models. To address this, the authors propose a lightweight expanding-and-shrinking operation that enriches binary feature maps with a negligible increase in computational complexity, thereby strengthening representation capacity. The operation is architecture-agnostic, integrating into both CNN- and Transformer-based BNNs and remaining compatible with standard binarization training pipelines. Extensive experiments across diverse applications, including image classification, object detection, and generative diffusion models, show consistent improvements over various leading binarization algorithms.
📝 Abstract
While binary neural networks (BNNs) offer significant benefits in terms of speed, memory, and energy, they suffer substantial accuracy degradation on challenging tasks compared to their real-valued counterparts. Due to the binarization of weights and activations, the possible values of each entry in the feature maps generated by BNNs are strongly constrained. To tackle this limitation, we propose the expanding-and-shrinking operation, which enhances binary feature maps with a negligible increase in computational complexity, thereby strengthening their representation capacity. Extensive experiments conducted on multiple benchmarks reveal that our approach generalizes well across diverse applications ranging from image classification and object detection to generative diffusion models, while also achieving remarkable improvements over various leading binarization algorithms built on different architectures, including both CNNs and Transformers.
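To make the value-constraint problem concrete, here is a minimal NumPy sketch of one way an expand-then-shrink step can widen the set of values a binarized feature map can take. The shifts, weights, and exact form below are illustrative assumptions, not the paper's actual operation: each of K branches binarizes a shifted copy of the input, and a weighted sum collapses them back, so each output entry can take up to K + 1 levels instead of 2.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W, K = 4, 8, 8, 3        # channels, spatial size, expansion factor

x = rng.standard_normal((C, H, W))   # real-valued pre-activation

# Expand: binarize K shifted copies (each branch is still a cheap binary map).
biases  = np.array([-0.5, 0.0, 0.5])   # hypothetical per-branch shifts
weights = np.array([0.25, 0.5, 0.25])  # hypothetical shrink weights
branches = np.sign(x[None] + biases[:, None, None, None])   # (K, C, H, W), entries in {-1, +1}

# Shrink: a weighted sum collapses the K binary branches back to C channels;
# each output entry now ranges over up to K + 1 distinct levels instead of 2.
y = (branches * weights[:, None, None, None]).sum(axis=0)

levels = np.unique(np.round(y, 6))
print(len(levels))   # 4 distinct levels with these weights, vs. 2 for plain sign(x)
```

Each branch remains a 1-bit tensor, so binary convolution hardware is unaffected; only the small shrink step works with multi-valued entries, which is the intuition behind "negligible increase in computational complexity."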