Frequency-Enhanced Dual-Subspace Networks for Few-Shot Fine-Grained Image Classification

📅 2026-04-16

🤖 AI Summary
This work addresses a limitation of existing few-shot fine-grained image classification methods: they rely solely on spatial-domain features, which leaves them prone to texture bias, high-frequency background noise, and unstable metric learning. The paper proposes a frequency-enhanced dual-subspace network (FEDSNet) that brings frequency-domain structural information into the metric. It applies the discrete cosine transform (DCT) with low-pass filtering to separate low-frequency structural components from spatial texture features, builds one low-rank subspace per view with truncated singular value decomposition (SVD), and fuses the two projection distances through an adaptive gating mechanism. On four standard benchmarks (CUB-200-2011, Stanford Cars, Stanford Dogs, and FGVC-Aircraft), the authors report highly competitive accuracy against existing metric-learning methods, together with a favorable balance between accuracy and computational cost.
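The DCT-plus-low-pass step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the square low-frequency mask, and the `keep` cutoff are assumptions made for illustration, and the paper's actual filter may differ.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size (n, n)."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # spatial index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)  # rescale the DC row so that mat @ mat.T == I
    return mat

def low_pass_dct(feature_map, keep):
    """Keep only the lowest `keep` x `keep` DCT coefficients per channel.

    feature_map: array of shape (C, H, W).
    keep: hypothetical cutoff (assumption, not taken from the paper).
    """
    _, h, w = feature_map.shape
    dh, dw = dct_matrix(h), dct_matrix(w)
    coeffs = dh @ feature_map @ dw.T    # 2-D DCT applied to every channel
    mask = np.zeros((h, w))
    mask[:keep, :keep] = 1.0            # top-left corner holds low frequencies
    return dh.T @ (coeffs * mask) @ dw  # inverse DCT of the masked spectrum
```

With `keep=1` only the DC coefficient survives, so each channel collapses to its mean; with `keep` at least as large as both spatial sizes the input is returned unchanged. Values in between trade fine texture for global structure.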

📝 Abstract
Few-shot fine-grained image classification aims to recognize subcategories with high visual similarity using only a limited number of annotated samples. Existing metric learning-based methods typically rely solely on spatial domain features. Confined to this single perspective, models inevitably suffer from inherent texture biases, entangling essential structural details with high-frequency background noise. Furthermore, lacking cross-view geometric constraints, single-view metrics tend to overfit this noise, resulting in structural instability under few-shot conditions. To address these issues, this paper proposes the Frequency-Enhanced Dual-Subspace Network (FEDSNet). Specifically, FEDSNet utilizes the Discrete Cosine Transform (DCT) and a low-pass filtering mechanism to explicitly isolate low-frequency global structural components from spatial features, thereby suppressing background interference. Truncated Singular Value Decomposition (SVD) is employed to construct independent, low-rank linear subspaces for both spatial texture and frequency structural features. An adaptive gating mechanism is designed to dynamically fuse the projection distances from these dual views. This strategy leverages the structural stability of the frequency subspace to prevent the spatial subspace from overfitting to background features. Extensive experiments on four benchmark datasets - CUB-200-2011, Stanford Cars, Stanford Dogs, and FGVC-Aircraft - demonstrate that FEDSNet exhibits excellent classification performance and robustness, achieving highly competitive results compared to existing metric learning algorithms. Complexity analysis further confirms that the proposed network achieves a favorable balance between high accuracy and computational efficiency, providing an effective new paradigm for few-shot fine-grained visual recognition.
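The subspace and gating steps described in the abstract can be sketched as follows. This is an illustrative reading under stated assumptions, not FEDSNet itself: the function names are hypothetical, the support features are not mean-centered, and the gate is reduced to a sigmoid over a single scalar `gate_logit`, whereas the paper's gate is adaptive and its exact form is not given here.

```python
import numpy as np

def class_subspace(support, rank):
    """Orthonormal basis of a rank-`rank` subspace via truncated SVD.

    support: (d, m) matrix whose columns are the support features of one class.
    """
    u, _, _ = np.linalg.svd(support, full_matrices=False)
    return u[:, :rank]  # keep the leading left singular vectors

def projection_distance(query, basis):
    """Squared norm of the part of `query` lying outside the subspace."""
    residual = query - basis @ (basis.T @ query)
    return float(residual @ residual)

def fused_distance(d_spatial, d_freq, gate_logit):
    """Convex combination of the two view distances (gate form is assumed)."""
    g = 1.0 / (1.0 + np.exp(-gate_logit))  # sigmoid gate in (0, 1)
    return g * d_spatial + (1.0 - g) * d_freq
```

In an episode, one basis per class would be built for each view (spatial and frequency), and a query would be assigned to the class with the smallest fused distance. A gate near 0 leans on the frequency subspace, which is the behavior the abstract credits with limiting overfitting to background features.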
Problem

Research questions and friction points this paper is trying to address.

few-shot learning
fine-grained image classification
metric learning
texture bias
structural instability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Frequency-Enhanced Dual-Subspace
Discrete Cosine Transform (DCT)
Low-Rank Subspace
Metric Learning
Few-Shot Fine-Grained Classification
Meijia Wang
School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an 710000, China
Guochao Wang
School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an 710000, China
Haozhen Chu
School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an 710000, China
Bin Yao
School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an 710000, China
Weichuan Zhang
Full Professor, Shaanxi University of Science & Technology
Image Processing, Image Analysis, Pattern Recognition, Computer Vision
Yuan Wang
School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an 710000, China
Junpo Yang
School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an 710000, China