🤖 AI Summary
This work addresses the low-frequency bias prevalent in lightweight image classification models by systematically analyzing, from a frequency-domain perspective, how gating mechanisms affect neural network training dynamics. We establish, for the first time, a theoretical frequency-domain interpretation of gating operations (specifically, element-wise multiplication coupled with a nonlinear activation), revealing how the two jointly modulate components at different frequencies. Guided by this analysis, we propose GmNet, a lightweight architecture that minimizes low-frequency bias via a frequency-sensitive structure for controlling information flow, overcoming the empirical limitations of conventional gating designs. By using frequency-domain insights derived from the convolution theorem for principled model design, GmNet achieves superior accuracy and inference efficiency with fewer parameters on benchmarks including ImageNet, significantly outperforming state-of-the-art lightweight models such as MobileNetV3 and EfficientNet-Lite.
📝 Abstract
Gating mechanisms have emerged as an effective strategy in model designs beyond recurrent neural networks for addressing long-range dependency problems. Broadly, they provide adaptive control over information flow while maintaining computational efficiency. However, theoretical analysis of how gating mechanisms work in neural networks is lacking. In this paper, inspired by the convolution theorem, we systematically explore the effect of gating mechanisms on the training dynamics of neural networks from a frequency perspective. We investigate the interaction between the element-wise product and activation functions in managing the responses to different frequency components. Leveraging these insights, we propose the Gating Mechanism Network (GmNet), a lightweight model designed to efficiently utilize information from various frequency components. It minimizes the low-frequency bias present in existing lightweight models. GmNet achieves impressive performance in terms of both effectiveness and efficiency on the image classification task.
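The convolution theorem that the abstract cites can be illustrated with a small numerical sketch. This is not the paper's analysis or code. It is a generic NumPy example under assumed toy inputs: `feature` and `gate` are hypothetical 1-D cosine signals standing in for a feature map and a gating branch, and `circular_convolve` and `dominant_frequencies` are helper names introduced here for illustration. It checks that the DFT of an element-wise product equals the circular convolution of the two spectra divided by the signal length, and that multiplying two low-frequency signals yields a component at a higher frequency than either input.

```python
import numpy as np

def circular_convolve(a, b):
    """Circular convolution of two equal-length 1-D arrays (direct O(N^2) sum)."""
    n = len(a)
    out = np.zeros(n, dtype=complex)
    for k in range(n):
        for m in range(n):
            out[k] += a[m] * b[(k - m) % n]
    return out

def dominant_frequencies(signal, tol=1e-8):
    """Return the non-negative frequency bins whose magnitude is non-negligible."""
    magnitudes = np.abs(np.fft.rfft(signal))
    return [k for k, v in enumerate(magnitudes) if v > tol * len(signal)]

N = 64
n = np.arange(N)

# Hypothetical toy signals: a "feature" at frequency bin 3 and a "gate" at bin 5.
feature = np.cos(2 * np.pi * 3 * n / N)
gate = np.cos(2 * np.pi * 5 * n / N)

# Gating as an element-wise product in the signal domain.
gated = feature * gate

# Convolution theorem for the DFT: FFT(x * y) == circular_conv(FFT(x), FFT(y)) / N.
lhs = np.fft.fft(gated)
rhs = circular_convolve(np.fft.fft(feature), np.fft.fft(gate)) / N
theorem_holds = np.allclose(lhs, rhs)

# The product contains the difference (5 - 3) and sum (5 + 3) frequencies.
product_bins = dominant_frequencies(gated)

print("theorem holds:", theorem_holds)
print("feature bins:", dominant_frequencies(feature))
print("gate bins:", dominant_frequencies(gate))
print("gated bins:", product_bins)
```

In this toy case the product carries energy at bin 8, above both input frequencies, which matches the intuition that an element-wise product can give a network access to higher-frequency content. How a nonlinear activation on the gating branch reshapes that spectrum is the subject of the paper's analysis and is not modeled here.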