🤖 AI Summary
One-shot learning aims to generalize to unseen categories using only one annotated sample per novel class. This paper proposes a method integrating subspace decomposition with transductive inference: images are modeled as linear combinations of latent semantic primitives, and cross-category label propagation is achieved via geometric structure disentanglement. To the authors' knowledge, this is the first work to unify transductive learning with subspace decomposition, enabling single-sample-driven discovery and transfer of semantic primitives. The approach is compatible with multiple backbone architectures, including ResNet and Vision Transformers, and achieves significant performance gains over state-of-the-art one-shot learning methods across diverse benchmark datasets. Empirical results demonstrate superior generalization and robustness, validating both the semantic decomposition paradigm and the benefits of transductive inference in extreme low-data regimes.
📝 Abstract
One-shot learning focuses on adapting pretrained models to recognize newly introduced, unseen classes from a single labeled image. While few-shot and zero-shot variants exist, one-shot learning remains a challenging yet crucial problem because it requires generalizing knowledge to unseen classes from just one human-annotated image. In this paper, we introduce a transductive one-shot learning approach that employs subspace decomposition to exploit information from both the labeled images in the support set and the unlabeled images in the query set. These images are decomposed into linear combinations of latent variables representing primitives captured by smaller subspaces. By representing query images as linear combinations of these latent primitives, we can propagate the label from a single support image to query images that share similar combinations of primitives. Through a comprehensive quantitative analysis across various neural network feature extractors and datasets, we demonstrate that our approach effectively generalizes to novel classes from just one labeled image.
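To make the pipeline in the abstract concrete, here is a minimal, illustrative sketch of the general idea: latent primitives are learned transductively from the union of support and query features, every image is expressed as a linear combination of those primitives, and each query inherits the label of the support image with the most similar coefficient vector. This is not the paper's actual algorithm; the synthetic features, truncated SVD as the decomposition, the choice of 8 primitives, and cosine-similarity label propagation are all simplifying assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy features standing in for backbone embeddings (the paper uses
# feature extractors such as ResNet or ViT): one labeled support
# image per class, plus unlabeled query images.
n_classes, dim, n_query = 3, 64, 30
prototypes = rng.normal(size=(n_classes, dim))
support = prototypes + 0.1 * rng.normal(size=(n_classes, dim))
query_labels = rng.integers(0, n_classes, size=n_query)
query = prototypes[query_labels] + 0.1 * rng.normal(size=(n_query, dim))

# Transductive step: learn latent "primitives" jointly from support
# AND query features. Truncated SVD is a simple stand-in for the
# paper's subspace decomposition; 8 primitives is an arbitrary choice.
X = np.vstack([support, query])
_, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
basis = Vt[:8]  # rows = latent primitives

# Represent every image as a linear combination of the primitives.
coef_s = support @ basis.T  # (n_classes, 8)
coef_q = query @ basis.T    # (n_query, 8)

# Label propagation: each query takes the label of the support image
# whose primitive coefficients are most similar (cosine similarity).
def unit(a):
    return a / np.linalg.norm(a, axis=1, keepdims=True)

sim = unit(coef_q) @ unit(coef_s).T
pred = sim.argmax(axis=1)
accuracy = (pred == query_labels).mean()
print(f"one-shot accuracy on toy data: {accuracy:.2f}")
```

On this easily separable toy data the propagation recovers nearly all query labels; the point is only to show how coefficient vectors over shared primitives, rather than raw features, mediate the support-to-query label transfer.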