🤖 AI Summary
To address overfitting and poor generalization caused by data scarcity in few-shot image classification, this paper proposes SFIFNet—a novel network that explicitly fuses frequency-domain and spatial-domain information during data preprocessing for the first time. Methodologically, SFIFNet leverages frequency transforms (e.g., DCT or FFT) to extract global texture and structural priors, integrates them with multi-scale spatial features via a lightweight deep neural architecture, and further enhances robustness through conventional data augmentation. Its key contribution lies in breaking the prevailing reliance on spatial-domain representations alone, systematically exploiting discriminative and complementary features encoded in the frequency domain. Extensive experiments on standard few-shot benchmarks—including Mini-ImageNet and CUB—demonstrate that SFIFNet achieves significant improvements in classification accuracy (average gain of +2.3%) and superior cross-domain generalization capability.
📝 Abstract
The objective of Few-shot learning is to fully leverage the limited data resources for exploring the latent correlations within the data by applying algorithms and training a model with outstanding performance that can adequately meet the demands of practical applications. In practical applications, the number of images in each category is usually less than that in traditional deep learning, which can lead to over-fitting and poor generalization performance. Currently, many Few-shot classification models pay more attention to spatial domain information while neglecting frequency domain information, which contains more feature information. Ignoring frequency domain information will prevent the model from fully exploiting feature information, which would effect the classification performance. Based on conventional data augmentation, this paper proposes an SFIFNet with innovative data preprocessing. The key of this method is enhancing the accuracy of image feature representation by integrating frequency domain information with spatial domain information. The experimental results demonstrate the effectiveness of this method in enhancing classification performance.