🤖 AI Summary
Ultra-fine-grained visual categorization (Ultra-FGVC) is difficult because training data are extremely limited and the visual differences between classes are minimal. This work proposes the Geometric Attribute Exploration Network (GAEor), which models intrinsic object details aligned with geometric structures, such as soybean leaf venation, as geometric attributes that serve as alternative recognition cues, a factor the authors describe as largely overlooked in recent research. GAEor first amplifies geometry-relevant details via visual feedback from a backbone network, then embeds the relative polar coordinates of those details into the final representation, capturing class-specific geometric patterns rather than relying only on subtle appearance variations. Built as a general self-supervised framework, the method reports new state-of-the-art results on five widely used Ultra-FGVC benchmarks.
📝 Abstract
This paper investigates the intrinsic geometrical features of highly similar objects and introduces a general self-supervised framework called the Geometric Attribute Exploration Network (GAEor), which is designed to address the ultra-fine-grained visual categorization (Ultra-FGVC) task in data-limited scenarios. Unlike prior work, which typically focuses on capturing subtle yet critical distinctions, GAEor generates geometric attributes as novel alternative recognition cues. These attributes are determined by various details within the object that are aligned with its geometric patterns, such as the intricate vein structures in soybean leaves. Crucially, each category exhibits distinct geometric descriptors that serve as powerful cues even among objects with minimal visual variation, a factor largely overlooked in recent research. GAEor discovers these geometric attributes by first amplifying geometry-relevant details via visual feedback from a backbone network, then embedding the relative polar coordinates of these details into the final representation. Extensive experiments demonstrate that GAEor sets new state-of-the-art results on five widely used Ultra-FGVC benchmarks.
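To make the idea of relative polar coordinates concrete, the following is a minimal sketch of what such coordinates look like for a set of detail locations. It is not GAEor's implementation: the abstract does not specify the reference point, the normalization, or how the coordinates are embedded, so the function name `relative_polar_coords`, the default centroid reference, and the max-radius normalization are all assumptions made for illustration.

```python
import math


def relative_polar_coords(points, reference=None):
    """Convert (x, y) locations to (radius, angle) pairs relative to a reference.

    Assumptions (not taken from the paper): the reference defaults to the
    centroid of the points, and radii are divided by the largest radius so
    that they fall in [0, 1].
    """
    if not points:
        return []
    if reference is None:
        cx = sum(x for x, _ in points) / len(points)
        cy = sum(y for _, y in points) / len(points)
    else:
        cx, cy = reference
    # Radius and angle of each point as seen from the reference point.
    raw = [
        (math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx))
        for x, y in points
    ]
    # Avoid dividing by zero when every point coincides with the reference.
    max_r = max(r for r, _ in raw) or 1.0
    return [(r / max_r, theta) for r, theta in raw]


# Hypothetical detail locations, e.g. positions of strong feature responses.
details = [(2.0, 0.0), (0.0, 2.0), (-2.0, 0.0), (0.0, -2.0)]
polar = relative_polar_coords(details)

# The same layout shifted elsewhere in the image.
shifted = [(x + 5.0, y + 7.0) for x, y in details]
polar_shifted = relative_polar_coords(shifted)
```

Because the coordinates are measured from a reference inside the object rather than from the image origin, the shifted layout yields the same description as the original one. This kind of position independence is one plausible reason to describe geometric patterns in relative rather than absolute terms.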